Abstract


Title  of  Thesis  :  On  Image  Coding  and  Understanding  : 

A  Bayesian  Formulation  for  the  Problem  of  Template  Matching 
based  on  Coded  Image  Data 
Name  of  degree  candidate  :  Emmanuil  N.  Frantzeskakis 
Degree  and  Year  :  Master  of  Science,  1990 
Thesis  directed  by:  John  S.  Baras, 

Professor,  Electrical  Engineering  Department 

Some instances of the template matching problem, primarily for binary images
corrupted with spatially white binary symmetric noise, are studied. We use the
pixel-valued image data as well as data coded by two simple schemes: a modification
of the Hadamard basis and the coarsening of resolution. Bayesian matching rules based
on M-ary hypothesis tests are developed, and the performance of these rules is
evaluated.

This approach to the matching problem is intended to show the trade-off between
quantization and external noise with respect to the ability to detect an object in
the image. We consider the case of a black square template on a white background or
without known background, as well as a synthetic template without known background.
By external noise we mean either the noise generated at the moment we receive the
uncoded image, in which case we have a “corrupt-code-detect system”, or the noise
arising from the transmission of the coded image over a noisy channel, in which
case we have a “code-corrupt-detect system”. In both cases the noise is assumed to
be white.


The sum-of-pixels and the histogram statistics are introduced in order to reduce
the computational load induced by the correlation statistic, at the penalty of an
increased probability of false alarm.

The present work is intended to show the usefulness and feasibility of
combining an image coding technique with an algorithm for extracting some “base”
information used in image understanding. Numerical and simulation results are given.

Chapter 1

The  Template  Matching 
Capability  as  a  Measure  of 
the  Image  Quality 

1.1  The  Template  Matching  as  Part  of  the  Image 
Understanding  Problem. 

The “Image Understanding Problem” poses the question of understanding
what a given two-dimensional image represents. The scope of the problem can
be expanded further if we consider a sequence of such images at successive time
instants and ask about time-dependent events, such as the relative motion
of the objects represented.

Consider now a special case of the problem for a fixed time parameter and
a multiple gray level digital image. A general structure of a system proposed
by Marr [Ros90] for solving this special case of the understanding problem is
shown in figure 1.1. The input digital image usually is an array of pixels whose
values are the gray levels representing the brightness at a closely spaced grid of
points. Segmentation and feature detection can be regarded as assigning labels
to the image pixels indicating classes to which the pixels belong, like light/dark,
edge/non-edge, being a member of some known pattern or not, etc. So, after
segmentation and feature detection, a symbolic image in which pixel values are
labels is obtained.


Figure 1.1 Structure of a generic image understanding system as proposed by Marr


Labeled pixels satisfying some specific criteria are grouped together into
image parts, resulting in a new (re)segmentation output, and properties of these
parts like area, orientation, etc. are measured; in this way a relational graph is
formed. Now the object recognition problem can be viewed as finding subcollections
of the image parts whose properties and relations satisfy certain constraints. The
image understanding problem still requires information about the relations among
these objects in the given image, and this would not be difficult to find if the
scheme described so far could deal satisfactorily with real world images.

Unfortunately, this is not the case. Some of the reasons for this failure are:

1. The process discussed is a strictly bottom-up one; human vision has a mixed
top-down and bottom-up nature. That is, we simultaneously process important
local characteristics and consider global features of an object when we
perform object recognition.

2. The algorithms which implement the tasks constituting the recognition process
discussed are often ad hoc and heuristic and consequently do not admit
systematic performance evaluation. Typically the latter is based on experiments
and simulations with small samples, resulting in misleadingly optimistic
performance predictions.

3. The noise effects which are present in real world images usually are not
adequately taken into account, or not at all.


1.2 Approaches to the Template Matching Problem

Let I be an n×n-pixel multiple gray level image (with E gray levels) and T be
an m×m-pixel template pattern. The exact template matching problem amounts to
finding all positions (i,j) in I, i,j = 1,2,...,n-m+1, such that I[i+k,j+l] = T[k,l],
k,l = 0,1,...,m-1. It is easy to see that this problem is equivalent to the string
matching problem with text string length $n_1 = n^2$ and pattern string length
$m_1 = m^2$. So, we can take advantage of the existing literature on this problem
and solve it in time $O(n_1)$ [KMP77, BoMo77], or in time
$O(\log E \cdot n_1 \log m_1)$ [FiPa74], where E is the number of gray levels in
the image.
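
The definition of exact template matching can be illustrated with a minimal brute-force sketch. It is not one of the fast string matching algorithms cited above: it simply tests the condition I[i+k,j+l] = T[k,l] at every position, taking time proportional to n²m². The function name `exact_matches` and the 0-based indexing are assumptions made for this illustration.

```python
def exact_matches(image, template):
    """Return all 0-based (i, j) positions where the m x m template
    occurs exactly in the n x n image (brute force, illustration only)."""
    n, m = len(image), len(template)
    hits = []
    for i in range(n - m + 1):
        for j in range(n - m + 1):
            # The window matches only if every pixel agrees with the template.
            if all(image[i + k][j + l] == template[k][l]
                   for k in range(m) for l in range(m)):
                hits.append((i, j))
    return hits


image = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
template = [[1, 1], [1, 1]]
print(exact_matches(image, template))  # prints [(1, 1)]
```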

Let us now consider a binary image I, assigning 0 and 1 as the values of white
and black pixels respectively, which is corrupted by Binary Symmetric Noise
with pixel inversion probability equal to $\epsilon$. Suppose that, given an image I
and an m×m window on it, call it $I_w[i,j]$ (i.e. $I_w[i,j][k,l] = I[i+k,j+l]$,
k,l = 0,1,...,m-1), we have to decide which one of two possible templates $T_1$, $T_2$
is the original noise-free image window represented by the observed data $I_w$.
We will follow a statistical interpretation as in [DuHa73]. We have:

$$\Pr\{I_w[i,j] \mid T_1\} = (1-\epsilon)^{\#\,\text{matching values}} \; \epsilon^{\#\,\text{non matching values}}
= \prod_{k,l} (1-\epsilon)^{1-|I[i+k,j+l]-T_1[k,l]|} \times \epsilon^{|I[i+k,j+l]-T_1[k,l]|}$$

and we can get a similar expression for $\Pr\{I_w \mid T_2\}$. In practice we have
$\epsilon < 0.5$ and so the maximum likelihood rule:

decide $T_1$ if $\Pr\{I_w \mid T_1\} > \Pr\{I_w \mid T_2\}$, or else decide $T_2$

reduces to

decide for the template with the least number of mismatches with the window $I_w$
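
The reduction of the maximum likelihood rule to a mismatch count can be made explicit with a short worked equation. Here $\epsilon$ denotes the pixel inversion probability, and $d$ is a symbol introduced only for this illustration: the number of mismatches between the window and a template T.

```latex
\Pr\{I_w \mid T\} = (1-\epsilon)^{m^2 - d}\,\epsilon^{d}
                  = (1-\epsilon)^{m^2}\left(\frac{\epsilon}{1-\epsilon}\right)^{d},
\qquad
d = \sum_{k,l} \bigl| I[i+k,j+l] - T[k,l] \bigr| .
```

For $\epsilon < 0.5$ the ratio $\epsilon/(1-\epsilon)$ is smaller than 1, so the likelihood decreases strictly as $d$ grows, and choosing the larger likelihood is the same as choosing the template with fewer mismatches.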

Pictorially, we have the decision regions given in figure 1.2:


Figure 1.2 Picture of the decision regions for two candidate templates in the
space of all binary m × m patterns


More practically, let us suppose that we are given only one template T and have
to decide whether or not it matches the image window $I_w$. Intuitively,
a “maximum likelihood” way of thinking would lead us to a decision region
looking like a sphere in the space of binary m×m patterns, centered at T and
having as radius some integer K. In simple words, we will decide matching if the
number of mismatches is at most equal to K. The distance implied by the sphere
is the Hamming distance $d_H(I_w, T)$ between $I_w$ and T. This situation is
pictorially shown in figure 1.3.

For the case of images corrupted by additive white noise our problem is
equivalent to the string matching problem with K mismatches and therefore it
can be solved in time $O(n_1 \log m_1)$ [AmFa89]. Here we will not consider the
computational complexity of the problem; we will rather pose the question of
determining some “optimal” value for K, given some information about the source
of disruption we want to overcome by introducing a mismatch tolerance. More
specifically, in chapter 2 we find K as a function of the noise parameter
$\epsilon$; this question has to do with the reliability of the matching rule.
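
The matching rule with a mismatch tolerance K can be sketched as follows. This is a brute-force illustration of the decision rule only, not of the fast K-mismatch algorithms cited; the function names `hamming_distance` and `matches_within_k` are hypothetical, and the choice of K is left to the caller since its optimal value is the subject of chapter 2.

```python
def hamming_distance(window, template):
    """Number of pixel positions where the window and the template differ."""
    return sum(w != t
               for w_row, t_row in zip(window, template)
               for w, t in zip(w_row, t_row))


def matches_within_k(image, template, K):
    """Return 0-based (i, j) positions whose m x m window lies inside the
    Hamming sphere of radius K centered at the template."""
    n, m = len(image), len(template)
    hits = []
    for i in range(n - m + 1):
        for j in range(n - m + 1):
            window = [row[j:j + m] for row in image[i:i + m]]
            if hamming_distance(window, template) <= K:
                hits.append((i, j))
    return hits


image = [
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
template = [[1, 1], [1, 1]]
print(matches_within_k(image, template, 0))  # prints []
print(matches_within_k(image, template, 1))  # prints [(0, 0)]
```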


Figure  1.3  Decision  regions  using  Hamming  distance 

Let us return now to the case of multiple gray level images. We still have fast
algorithms for solving the K-mismatches problem as a string matching problem
(time $O(n_1 \sqrt{m_1} \log m_1)$ [Abr87]). In [AmFa89] the problem of finding the
occurrences of a non-rectangular pattern of height m and area a in an n×n text
with no more than K mismatches is considered and it is shown to be solved in
time  ()  ( A  ir  yJTn  log  ni  -f  K~m~ ). 

The number of mismatches is a descriptive statistic when applied to text
strings; given that all the letter transitions, e.g. an “a” turned into “e”, an “a”
turned into “q”, etc., are treated the same way, the number of these transitions is
good enough to describe the corruption that occurred. On the other hand, for the
case of template matching in multiple gray level images we would not like to treat
transitions among all gray levels the same way, since transitions among neighboring
gray levels are more likely to occur.

There are various metrics, or distortion measures, used in classical template
matching approaches in order to account for the lack of uniformity just mentioned.
Some of them [DuHa73] are:

1. The absolute distance: $\sum_{k,l} |I[i+k,j+l] - T[k,l]|$

2. The Euclidean distance: $\sum_{k,l} (I[i+k,j+l] - T[k,l])^2$

3. The cross correlation: $R(i,j) = \sum_{k,l} I[i+k,j+l] \times T[k,l]$

This is equivalent to the Euclidean distance if we assume constant picture
energy in the window, i.e. that $\sum_{k,l} I^2[i+k,j+l]$ is independent of the
position (i,j). It can be computed efficiently by using the FFT.

4. The normalized cross correlation: $N(i,j) = R(i,j) / \sqrt{\sum_{k,l} I^2[i+k,j+l]}$
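
The four measures listed above can be computed directly for a single window. The sketch below is a plain illustration under stated assumptions: the function names are hypothetical, the window is passed in already extracted, and returning 0.0 for a window with zero energy is a convention chosen here, not something the text specifies.

```python
import math


def _pairs(window, template):
    """Yield corresponding (window pixel, template pixel) pairs."""
    for w_row, t_row in zip(window, template):
        yield from zip(w_row, t_row)


def absolute_distance(window, template):
    return sum(abs(w - t) for w, t in _pairs(window, template))


def squared_distance(window, template):
    # Measure 2 in the list: a sum of squared differences.
    return sum((w - t) ** 2 for w, t in _pairs(window, template))


def cross_correlation(window, template):
    return sum(w * t for w, t in _pairs(window, template))


def normalized_cross_correlation(window, template):
    energy = sum(w * w for row in window for w in row)
    if energy == 0:
        return 0.0  # assumed convention for an all-zero window
    return cross_correlation(window, template) / math.sqrt(energy)


window = [[3, 4], [0, 0]]
template = [[1, 2], [0, 1]]
print(absolute_distance(window, template))   # prints 5
print(squared_distance(window, template))    # prints 9
print(cross_correlation(window, template))   # prints 11
```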

We can now repeat the method used previously with the Hamming distance.
First, suppose we are given two candidate templates $T_1$ and $T_2$ to associate
with $I_w$. We can use one of the above distortion measures in order to compare them
and come up with a decision. Second, if we have a single template to decide
about matching, we will compare the (generalized) distance $d(I_w,T)$ with some
threshold, which can be interpreted as the radius of some sphere centered at T.

If we introduce “maximum likelihood” reasoning for the multiple gray level
image, with E gray levels, the role of the metric (the Hamming distance for the
case of the binary image) is undertaken by the so-called “transition probability”;
this is based on a vector of the different kinds of matches/mismatches among the
gray levels, with E(E-1) degrees of freedom.

In  chapters  2  and  4  the  Bayesian  reasoning  is  introduced  for  the  template 
matching  problem  and  is  further  discussed  in  chapter  5.  In  the  Bayesian  approach 
the  “transition  probability”  metric  is  used  and  the  decision  regions  for  matching 
are  determined  so  that  a  cost  function  is  minimized. 


1.3 Performance Evaluation of “Matching” Algorithms
Acting Upon Noisy Image Data

Consider the following simple instance of the Template Matching problem: A
binary image represents a black square of size m×m pixels on a white background.
We want to find the m×m windows on the image containing part or all of the
black square. Evidently it suffices to detect a single black pixel in the window.
We can still compute the correlation R of the test window with the “full black
template”, i.e. with a square template of size m×m pixels consisting entirely
of black pixels (value = 1); R actually gives the number of black pixels in the
window and at the same time it is a measure of the overlap of the test window with
the black square. We can decide “part or all of the square detected” if R > 0 and
“square not detected” otherwise.

Suppose now that the image is transmitted over a noisy channel; as an effect
of this, several pixels are inverted. We model this phenomenon by saying that
the image is corrupted by White (in space) Binary Symmetric Noise with some
inversion probability $\epsilon$. From now on we will call this kind of noise
“BSC noise”, since it comes as an effect of the Binary Symmetric Channel:


Figure 1.4 The Binary Symmetric Channel (each input bit, 0 or 1, is received
unchanged with probability $1-\epsilon$ and inverted with probability $\epsilon$)


Again, given a window of the noise corrupted image we want to infer the
presence of the black square in it. Now it does not suffice to detect just one
black pixel. Recall, however, that in the noise free image the number of black
pixels in the window gives a measure of the overlap of the window with the square.
We will use this number in order to decide whether or not a square is present.
Effectively, we have to define some threshold t in the range of 0 to $m^2$ which
will determine the region $R_1 = \{t, t+1, \ldots, m^2\}$ for a positive answer to
the question posed. We note that in the noise free image we had t = 1 and
$R_1 = \{1, 2, \ldots, m^2\}$.
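
The noise model and the threshold rule can be sketched together in a few lines. This is only an illustration under stated assumptions: the function names `bsc_noise` and `square_detected` are hypothetical, the inversion probability is written `eps`, and the threshold `t` is supplied by the caller rather than optimized.

```python
import random


def bsc_noise(image, eps, rng):
    """Invert each binary pixel independently with probability eps."""
    return [[p ^ 1 if rng.random() < eps else p for p in row] for row in image]


def square_detected(window, t):
    """Decide 'square present' when the black-pixel count falls in
    the region {t, t+1, ..., m*m}."""
    return sum(sum(row) for row in window) >= t


rng = random.Random(0)                      # seeded only for repeatability
white_window = [[0] * 8 for _ in range(8)]  # window containing no square
noisy = bsc_noise(white_window, 0.1, rng)

# With t = 1 (the noise-free rule) a single inverted pixel is enough to
# raise a false alarm; a larger threshold tolerates some inversions.
print(square_detected(noisy, 1), square_detected(noisy, 13))
```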

For a given level of noise $\epsilon$ we want to determine the threshold value t in
such a way that the probability of an erroneous decision is minimized; this error
probability will be a linear combination [Nar89] of

$P_{fa} = \Pr\{\text{decide there is a square while it is not present}\}$ and

$P_m = \Pr\{\text{decide there is not a square while it is present}\}$

The performance evaluation we want to make for such an optimal rule amounts to:

1. Draw the $(P_{fa}, P_d)$ curve of the optimal Bayesian rules parametrized by the
noise parameter $\epsilon$, where $P_{fa}$ stands for probability of “false alarm”,
and $P_d$ stands for probability of “detection”, i.e. probability of deciding there
is a square while it is present. Note that $P_d = 1 - P_m$.

2. Draw the Receiver Operating Characteristic (ROC) curves of the test, i.e.
the $P_d(P_{fa})$ curves, for certain values of $\epsilon$.
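
To make the two error probabilities concrete, the sketch below computes the false alarm and detection probabilities of the count-threshold test for a deliberately simplified binary problem. The simplification is an assumption of this example, not the procedure of chapter 2: the "no square" hypothesis is an all-white window, and the "square" hypothesis is a window with one fixed noise-free Hamming weight `weight`, whereas the thesis starts from an M-ary test with priors. The numbers it produces are therefore not expected to reproduce the plots or table 1.1.

```python
from math import comb


def binom_pmf(n, p):
    """PMF of a Binomial(n, p) variable as a list indexed by the count."""
    return [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]


def count_pmf(num_pixels, weight, eps):
    """PMF of the observed black-pixel count for a window whose noise-free
    Hamming weight is `weight`, after BSC noise with parameter eps."""
    stay = binom_pmf(weight, 1 - eps)           # black pixels left black
    flip = binom_pmf(num_pixels - weight, eps)  # white pixels turned black
    pmf = [0.0] * (num_pixels + 1)
    for a, pa in enumerate(stay):
        for b, pb in enumerate(flip):
            pmf[a + b] += pa * pb
    return pmf


def roc_points(m, eps, weight):
    """(P_fa, P_d) for every threshold t = 0, 1, ..., m*m + 1."""
    n = m * m
    p0 = count_pmf(n, 0, eps)       # assumed hypothesis: no square in window
    p1 = count_pmf(n, weight, eps)  # assumed hypothesis: `weight` black pixels
    return [(sum(p0[t:]), sum(p1[t:])) for t in range(n + 2)]


points = roc_points(m=8, eps=0.1, weight=64)
p_fa, p_d = points[13]
print(f"t=13: P_fa={p_fa:.4f}, P_d={p_d:.4f}")
```

Plotting the second coordinate against the first for all thresholds gives one ROC curve; repeating for several values of `eps` gives a family of curves of the kind described in item 2.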

In order to obtain the above we use a “hypothesis testing” approach, broadly
used in communications detection theory [Poo88, pp7]. A description of the
procedure used to build such decision rules is given in chapter 2. At this point
we will only state that the square detection problem initially leads to an M-ary
hypothesis test; M here equals the number of different Hamming weights¹ in all
possible windows of the noise free image. This test is in turn reduced to a binary
hypothesis test. The curves 1 and 2 required for the performance evaluation of
the test are given in plot 3 and plot 4 respectively in appendix B (the size of the
square used for them is m=8); these curves are discussed in the next chapter.
In table 1.1 a set of ($\epsilon$,t) pairs is given showing the effect of noise
upon the threshold selected for the optimal Bayesian rules (recall: $t \le m^2 = 64$,
and in the ideal no-noise case t = 1).


¹ Hamming weight of the window here is the number of black pixels in it. In
subsection A.1 it is shown that $M = \frac{m \times (m+1)}{2} + 1$.
epsilon    threshold
0.02       5
0.1        13
0.2        21
0.3        29
0.4        37

Table 1.1 The threshold as a function of the noise


1.4 Image Coding Techniques Appropriate for Input
to a Template Matching Algorithm

In the last few years a great variety of image coding techniques has appeared in
the literature. The interested reader should refer to [NeLi80], [NaKi88], [KIK85],
[KBL87]. The compression obtained reaches the level of 0.25 bpp (bits per
pixel), e.g. by using a technique of Entropy Coded Quantization for subband
Image Coding [TaFa86].

For our purposes, however, we need coding techniques producing data acceptable
as input to a template matching algorithm. Since such algorithms scan the
image and perform some kind of local operations on it (i.e. operations acting on a
window of the image and not on its entire extent), we should restrict ourselves
to coding techniques for which the coded data preserve the local characteristics
of the uncoded image; therefore we should invoke some block coding technique.

A block coding scheme splits the image into blocks of some size (e.g. block
size of 4×4 pixels) and encodes each block separately from all the others as a vector
of random variables. The coded image will be a matrix of size $c_n \times c_n$ blocks
where $c_n$ = “original image dimension” / “block dimension” (both assumed to
be square). Each element of the matrix is the coded data for the corresponding
image block. For example, a 256×256-pixel image coded in blocks of size 4×4
pixels will result in a coded image of size $c_n \times c_n$ = 64×64 codewords.
A number of different approaches to block coding have been proposed. Here
we recall just a few: the ones which appear to be both common in the literature
and useful for the present work.
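
The block structure just described can be sketched as follows. The function name `split_into_blocks` is hypothetical, and the "coded data" of each block is simply its list of pixel values read row by row; a real scheme would replace that list by a codeword.

```python
def split_into_blocks(image, b):
    """Split an n x n image into a c x c matrix of b x b blocks, c = n / b.
    Each entry is the block's pixel values flattened row by row."""
    n = len(image)
    if n % b != 0:
        raise ValueError("image dimension must be a multiple of the block dimension")
    c = n // b
    return [[[image[bi * b + k][bj * b + l]
              for k in range(b) for l in range(b)]
             for bj in range(c)]
            for bi in range(c)]


# A 256 x 256 image in 4 x 4 blocks gives a 64 x 64 matrix of 16-component vectors.
image = [[(r + c) % 2 for c in range(256)] for r in range(256)]
blocks = split_into_blocks(image, 4)
print(len(blocks), len(blocks[0]), len(blocks[0][0]))  # prints 64 64 16
```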

Suppose that the pixel values in the block form an N-component vector X.

1. In [KIK85] the Karhunen-Loeve Transform (KLT) is used to produce a vector
Y of uncorrelated coefficients out of the vector X; then the coefficients of Y
are quantized by using the Lloyd-Max quantizer, while the number of bits
assigned to each component is a function of its variance and the total number
of bits available for coding the vector Y. We will briefly explain how the KLT
functions (note also that the Lloyd-Max quantizer is presented in [Max60]):
the KLT amounts to the transformation Y = A·X, where A is an N×N matrix
computed as follows:

First find the covariance matrix of X: $C_X = E[(X-EX)(X-EX)^T]$. The rows of
A are the normalized eigenvectors of the matrix $C_X$, i.e. they are solutions of
the equation $C_X x = \lambda_i x$ where the $\lambda_i$'s are the eigenvalues of
$C_X$. If we want to reject K out of N coefficients we may do so by leaving out the
eigenvectors which correspond to the K smallest eigenvalues of $C_X$. Then the mean
square reconstruction error² will be equal to the sum of these eigenvalues. This is
the minimum value we can obtain out of a transform of the form Y = A·X
[NeLi80]. The KLT has two problems: it defines A in terms of the matrix $C_X$,
which in general is not stationary. Also, the eigenvectors of $C_X$ are not always
distinct [NeLi80].

² $E[(X-A^{-1}Y)^T (X-A^{-1}Y)]$

2. In [MAY82] the coefficients of X are normalized over their variances and
the output vector Y is vector-quantized. A review of Vector Quantization
(VQ) is presented in [Gra84], while applications in image coding are given
in [NaKi88].

3. A popular transform which to some extent substitutes for the KLT is the
Discrete Cosine Transform (DCT), which overcomes the problems mentioned for the
KLT [ANR74].

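In the DCT we have Y = A·X, where

$$A = \{a_{ij}\}, \quad a_{ij} = \frac{2K(i)}{N} \cos\left( (2j+1) \frac{(i-1)\pi}{2N} \right), \quad i = 1, \ldots, N, \; j = 0, \ldots, N-1,$$

$$K(i) = \begin{cases} 1/\sqrt{2}, & i = 1 \\ 1, & i = 2, \ldots, N \\ 0, & \text{otherwise} \end{cases}$$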

4. The Symmetric Hadamard Transform [NeLi80] for $N = 2^n$, defined by

$$A = \{a_{ij}\}, \quad a_{ij} = \frac{1}{\sqrt{N}} (-1)^{b(i,j)}, \quad b(i,j) = \sum_{l=0}^{N-1} i_l j_l$$

where $i_l$, $j_l$ are the bit values in the binary representation of i and j
respectively in N bits. For example, for n=2 we have $1_{10} = 0001_2$,
$2_{10} = 0010_2$ (where $k_b$ stands for “representation of the number k in base b”),

$$a_{12} = \frac{1}{\sqrt{4}} (-1)^{0 \times 0 + 0 \times 0 + 0 \times 1 + 1 \times 0} = \frac{1}{2}$$

and similarly we can find

$$A_4 = \frac{1}{2} \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & -1 & 1 & -1 \\ 1 & 1 & -1 & -1 \\ 1 & -1 & -1 & 1 \end{bmatrix}.$$

If we take the 4 row vectors and split each one into two columns we obtain the
2×2 Hadamard basis given in figure 1.5a. The 4×4 Hadamard basis is given in
figure 1.5b.
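
The Hadamard definition in item 4 can be checked numerically with a short sketch. It assumes indices i and j running from 0 to N-1, and it evaluates the exponent (the sum of products of corresponding bits of i and j) as the number of 1 bits in `i & j`; the function name `hadamard` is chosen for this illustration.

```python
import math


def hadamard(n):
    """Symmetric Hadamard matrix of order N = 2**n with entries
    (1 / sqrt(N)) * (-1) ** b(i, j), for i, j = 0, ..., N-1."""
    N = 2 ** n
    scale = 1 / math.sqrt(N)
    # bin(i & j).count("1") counts the bit positions where both i and j
    # have a 1, which equals the sum over l of i_l * j_l.
    return [[scale * (-1) ** bin(i & j).count("1") for j in range(N)]
            for i in range(N)]


A4 = hadamard(2)
for row in A4:
    print(row)

# Each row, reshaped to 2 x 2, is one basis image of the 2 x 2 Hadamard basis.
basis = [[row[0:2], row[2:4]] for row in A4]
```

The rows of the matrix are mutually orthogonal with unit norm, so the same matrix serves for both the forward and the inverse transform.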


Another constraint posed in the coding procedure is the requirement that all
the blocks should be represented by the same number of bits. In other words
we should not exploit the Information Theoretic properties of the blocks, as the
Huffman codes do [Bla87, pp64]. This is required because at template matching
time we will need to know the precise location of each block in the image file.
With a variable length code, if on-line search is required we will suffer
considerable time delays. On the other hand, if auxiliary pointers are used to
avoid this delay, then the space advantage may be lost. The trade-off between
speed and data compression will not be considered here; at this point we will
require a fixed length code.
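
The benefit of a fixed length code for locating blocks can be shown with one line of arithmetic. The sketch assumes a hypothetical file layout in which the codewords are stored in row-major order with no header; the function name and its parameters are illustrative only.

```python
def block_bit_offset(block_row, block_col, blocks_per_row, bits_per_block):
    """Bit position of a block's codeword when every codeword has the same
    length and the blocks are stored row by row (assumed layout)."""
    return (block_row * blocks_per_row + block_col) * bits_per_block


# 64 x 64 blocks at 4 bits per block: block (2, 3) starts at bit 524.
print(block_bit_offset(2, 3, 64, 4))  # prints 524
```

With a variable length code no such formula exists: the position of a block depends on the lengths of all the codewords stored before it, which is why either a sequential search or a table of pointers would be needed.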

A final constraint, which is imposed due to the matching algorithm complexity,
is the requirement for the least possible size of the block-codebook. This will
be discussed further in chapter 3.




1.5  The  Combined  Compression  —  Feature  Detection  problem 

Consider the following scenario: We have a source image I, on which we
can perform a template matching algorithm A in order to extract some feature.
Alternatively we can compress the image, producing the compressed version
Q(I); after this we can reconstruct it, producing R(I) from Q(I), and perform
the algorithm A again on R(I). We expect that a “good” compression procedure
will not affect the “quality of the image”, or will not damage the “information
content of the image”. More explicitly, we would like the algorithm A acting
on I to give the same output as when acting on R(I).


Figure  1.6  The  feature  detection  problem 


So the effect of the template matching algorithm can be used to check the
“information persistence”, i.e. the feature detection capability after
the compression/reconstruction process. (We note that implicitly we admit here
that the “information” is also dependent on the detection algorithm A.) In general
we expect that R(I) will contain less information (will provoke more decision
errors when A is applied) than I contains.

Here we consider quantization as a simple form of compression. We also
implicitly assume that the primary purpose of compression is to speed up
transmission of the frame, or to allow us to store it in an efficient way.


Figure 1.7 The “corrupt-code-detect” model


As depicted in figure 1.6, we are interested in feature detection
algorithms $A_Q$ acting on the coded data Q(I). More specifically: Can we have
any advantage if we apply a feature detection algorithm $A_Q$ on the coded data
Q(I) instead of applying A on the reconstructed data R(I)? Or further, can we have
any advantage if we apply $A_Q$ on the coded data Q(I) instead of applying A on
the original data I? The first question may be answered positively even at an
intuitive level. We will argue for a positive answer to the second question as well.

Let us consider a more detailed diagram, see figure 1.7, in which we are not
interested in the reconstructed image R(I). The noise free multiple gray level image
can be characterized by its first order histogram, i.e. the relative frequencies of
the gray levels; these frequencies (normalized to sum to unity) will be called
prior probabilities or just priors. The introduction of noise means that several
pixels of certain gray levels will be changed into some other levels. We will call
the probabilities of such changes transition probabilities; these are determined
by the noise parameters. The compression procedure gives rise to some new
entities: the blocks, which have their own “priors”, found rather experimentally,
and their “transitions”, which are determined by the pixel-level transitions and
the compression rule itself. What we want to argue is that the distortion
introduced by the compression scheme may “kill” part of the external noise
(transmission channel noise) and therefore the compressed image Q(I) may be
more reliable than the noise corrupted one, N(I). Finally, the matching algorithms
A and $A_Q$ will be applied on data of the size of the template and give the positions
in the image where a “match” is decided. A window of an image having the size
of the template also has its “priors” and “transitions”, which are determined by
those of the block level as well as the structure of the template.

Suppose we use as coding scheme the Karhunen-Loeve transform described
earlier. The vector Y representing the data for an image block contains uncorrelated
coefficients which are real valued numbers. We scalar-quantize each one of
them. As we see in figure 1.8, too fine a quantization with respect to the noise
parameter (e.g. the variance $\sigma^2$ for the case of Gaussian noise) is useless, since
precision is lost because of the external noise (large shaded areas correspond to
the error probabilities). Somewhat coarser quantization may “kill” part of the
error introduced by the channel. Finally, too coarse a quantization means that the
quantization error dominates the channel error and we have even more loss in
precision. Now, if we suppose that the coefficients in Y are independent random
variables, the quantized vector $\hat{Y}$ will also have independent components whose
transitions are known. So the block-transition probabilities can be found as the
products of the component-level (pixel level) transition probabilities. Also, the image
windows can be thought of as vectors with independent block-components, and so
we can find their “transition probabilities” as well as their “priors”.


Hamming weight    quantization value
< 2               white
> 2               black
= 2               flip a fair coin and quantize according to the output value

Table 1.2 A simple quantization scheme


Although straightforward, the above scheme of quantization/coding is unrealizable
because of the very large size of the block codebook, which, as described
at the end of the previous subsection, should be small. A quantization scheme
which is simultaneously simple and realizable is the coarsening of the scale of
the image. Consider a binary image corrupted by BSC noise with parameter
$\epsilon$ and quantize each 2×2 pixel block into a full black or full white block
as described in table 1.2.

Here we have an effective compression rate equal to 0.25 bpp. The codebook
size is 2. The block-priors are both equal to 0.5. The transition probabilities are
derived in chapter 2. We can now use the compressed image to test for a match
with the full black template. The ROC of the test is given in plot 6. We note
that the ROC curve is closer to the point (0,1) than the one in plot 4 (at the
specific noise level $\epsilon$=0.1), which shows that the test on the compressed
image is a “more reliable” test, as was anticipated.
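
The coarsening rule of table 1.2 can be sketched as below, together with one transition probability worked out for a simplified case. The simplification is an assumption of this example: the 2×2 block is taken to be entirely black before the noise acts, so the full derivation of chapter 2 is not reproduced. The function names are hypothetical and the image dimension is assumed to be even.

```python
import random


def quantize_block(block, rng):
    """Table 1.2: map a 2 x 2 binary block to 0 (white) or 1 (black)."""
    weight = sum(sum(row) for row in block)
    if weight < 2:
        return 0
    if weight > 2:
        return 1
    return rng.randint(0, 1)  # Hamming weight 2: flip a fair coin


def coarsen(image, rng):
    """Quantize every 2 x 2 block of an n x n binary image (n even)."""
    n = len(image)
    return [[quantize_block([image[r][c:c + 2], image[r + 1][c:c + 2]], rng)
             for c in range(0, n, 2)]
            for r in range(0, n, 2)]


def black_block_error(eps):
    """Probability that a noise-free all-black block comes out white:
    P(weight 0) + P(weight 1) + 0.5 * P(weight 2) after BSC noise."""
    q = 1 - eps
    return eps**4 + 4 * eps**3 * q + 0.5 * 6 * eps**2 * q**2


print(round(black_block_error(0.1), 4))  # prints 0.028
```

Under this assumption a block is inverted with probability of about 0.028 when each pixel is inverted with probability 0.1, which illustrates how the coarsening can suppress part of the channel noise.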

1.6 Identification of the Problem Studied

Image compression and image understanding have traditionally been developed
as separate fields of research. In image compression one is primarily
interested in designing efficient coding schemes which allow the transmission and
accurate reconstruction of images at low bit rates. In image understanding
one is primarily interested in extracting high level or “content” information from
the image pixels. It is clear that both fields address the problem of efficient
information extraction and that in principle they are strongly coupled. We believe
that in order to understand the top-down and bottom-up processing performed in
human vision, one needs to unify these two fields. It is the purpose of this
thesis to perform an initial investigation in this direction. More specifically, we are
interested in linking image coding and object recognition quantitatively in some
simple generic examples. Such progress is necessary for our long range goal of
unifying image compression and image understanding.

To be more specific, we have considered recognition based on a well structured
feature detection algorithm, which we developed, resembling Template Matching
algorithms. We started by discussing the template matching problem and we shall
present our feature detection algorithm in chapters four and five. In trying to link
image processing with image understanding we introduce a novel idea: the
“numerical image” for us is not necessarily a pixel image but a block-coded
image. In this way the image compression (performed by the code) is linked
directly to image understanding. We can now expect to observe the two types of
interference or contamination induced on the image data, namely the “external
noise” introduced by the physical media (e.g. the transmission channel), and the
“quantization noise” introduced by the coding procedure (see fig 1.6). What we
are interested in analyzing is the process where image recognition can be performed
based on the reduced (coded) image data. It is clear that if we reduce the image
beyond a certain point, recognition capability will be affected. On the other hand,
it is desirable from a practical point of view to design schemes which can perform
object recognition on the basis of block-coded data without requiring the full
image reconstruction. It is clear that what we need to understand and quantify
is the explicit relationship between code efficiency and image compression on the
one hand and the required performance on the other. So the reader should not
expect to find in this work a review of image coding techniques or a proposed
solution of the 2-D Image Understanding problem.

In the first chapter we introduced the notions of image understanding, template
matching, hypothesis testing, image coding and block coding for images. We also
identified the problem we will focus on and put it in the broader perspective of
the image understanding problem. We further gave examples of the particular
aspect we will follow throughout this work.

In  the  second  chapter  we  show  how  the  template  matching  problem  in  a 
priori  known  background,  based  on  non  coded  image  data  can  be  formulated  as  a 
hypothesis  testing  problem  and  we  describe  the  effects  of  noise  on  the  reliability 
of  the  matching  test. 

In  the  third  chapter  we  describe  a  coding  procedure  upon  which  a  feasible 
matching  test  may  be  based;  we  give  the  block-priors  and  transitions.  Related 
coding  schemes  which  can  be  used  in  the  future  for  possible  extensions  are 
mentioned. 

In the fourth chapter the template matching problem on coded image data is
formulated as a hypothesis testing problem; the noise effects are studied.

In the fifth chapter the Bayesian nature of the developed tests is highlighted;
several sample tests are numerically evaluated so that the effects of noise,
compression and background knowledge can be compared. Related problems are
specified and further questions are posed.

Explanations on the algorithms implemented in actual computer code and
performance evaluation plots summarizing our analysis are given in the appendix.

Chapter 2

Template Matching on a
Binary Image

2.1 Binary Image Corrupted by AWGN:

Formulation of the Template Matching Problem as
a Hypothesis Testing Problem

We consider the following problem. A binary image represents a black mxm-pixel
square in white background. The image is transmitted over a noisy channel
that alters the pixel levels. We model this phenomenon by
saying that the image is corrupted by Additive White Gaussian Noise (AWGN)
of some variance $\sigma^2$; we will shortly show the interpretation of this assumption.
We want to find the mxm-pixel windows of the image containing part or all
of the black square. We will call this problem “the full-black template case in
background” of the template matching problem.

We first find the possible relative positions between the target square in the
noise free image and a test window scanning this image. Consider the case of
m=2. Then all the possible patterns of part of a black square that could appear
in a 2x2 test window are shown in figure 2.1.a.

Apart from the all-white window, we notice that all possible patterns are
constructed by placing the upper-left corner of the window in each one of 3 x 3 = 9
places of a square region containing the black square (see fig. 2.1.b) and reading all
possible 2x2 squares. This observation applies to the general case too, so we get
(2m-1) x (2m-1) non all-white windows of size mxm when searching for the mxm
black square. We also notice that for this case we have $(2m - 1)^2 + 1 = M + 1$
distinct patterns out of the $2^{m^2}$ possible ones.
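
The counting argument above can be checked numerically. The following Python sketch is an illustration only: the function name `window_patterns` and the padded-canvas construction are assumptions introduced here, not part of the original analysis. It slides an mxm window over a white canvas holding one mxm black square and collects every window that overlaps the square.

```python
from itertools import product

def window_patterns(m):
    """Return the non all-white m x m windows seen when scanning an m x m black square.

    Hypothetical helper: the square is embedded in a white canvas with m-1
    white pixels on each side, so every partial overlap can occur.
    """
    size = 3 * m - 2
    canvas = [[1 if m - 1 <= r < 2 * m - 1 and m - 1 <= c < 2 * m - 1 else 0
               for c in range(size)] for r in range(size)]
    patterns = []
    # (r0, c0) is the upper-left corner of the window; 2m-1 positions per axis
    for r0, c0 in product(range(2 * m - 1), repeat=2):
        window = tuple(tuple(canvas[r0 + r][c0 + c] for c in range(m))
                       for r in range(m))
        patterns.append(window)
    return patterns

if __name__ == "__main__":
    pats = window_patterns(2)
    print(len(pats), len(set(pats)))                    # number of windows, number of distinct ones
    print(sorted(sum(map(sum, p)) for p in pats))       # Hamming weights of the patterns
```

For m=2 the sketch yields nine distinct non all-white patterns, in agreement with the count (2m-1) x (2m-1), and their weights are the ones reused later by the sum-of-pixels statistic.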

The M+1 distinct window patterns may constitute a set of states
$H_0, H_1, \ldots, H_M$ ($H_0$ is assigned to the white pattern) in a likelihood ratio test
[Poo88,pp10]. The state $H_i$ will be characterized by a binary vector $s_i$ of length
$m^2$. The observation will be the image window itself represented by a real valued
vector $y$ of length $m^2$. Corruption by AWGN means that for each component $y^j$
of $y$ we have:

$$ y^j = \text{signal} + \text{noise} = s_i^j + n^j, \qquad n^j \sim N(0, \sigma^2) $$

and all $n^j$'s are mutually independent.


[Figure 2.1: (a) the all-white pattern 0 and the nine non all-white patterns 1 to 9; (b) the nine window placements 1 to 9 around the black square. "o": white pixel, "x": black pixel.]

Figure 2.1 Possible patterns in a 2x2-pixel test window


We now compute the prior probabilities of the states $H_i$, $i = 0, 1, \ldots, M$.
For an nxn-pixel image we have a total of $(n - m + 1)^2$ possible placements of
the test image window; we also have $(2m - 1)^2$ distinct placements that yield
non all-white patterns. So the prior probabilities are:

$$ p_i = \frac{1}{(n - m + 1)^2} = p, \qquad i = 1, 2, \ldots, M $$

$$ p_0 = \frac{(n - m + 1)^2 - (2m - 1)^2}{(n - m + 1)^2} = 1 - Mp $$
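
As a small worked example of these priors, the sketch below evaluates them exactly with rational arithmetic. The function name `state_priors` is hypothetical, and the computation assumes that the black square lies at least m-1 pixels away from the image border, so that all $(2m-1)^2$ overlapping window placements fit inside the image.

```python
from fractions import Fraction

def state_priors(n, m):
    """Return (M, p, p0) for an n x n image and an m x m black square."""
    placements = (n - m + 1) ** 2   # window positions inside the image
    M = (2 * m - 1) ** 2            # positions giving a non all-white window
    p = Fraction(1, placements)     # prior of each non all-white state H_i
    p0 = 1 - M * p                  # prior of the all-white state H_0
    return M, p, p0

if __name__ == "__main__":
    print(state_priors(8, 2))
```

For an 8x8 image and a 2x2 square this gives M = 9, p = 1/49 and a prior of 40/49 for the all-white state.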

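The distributions relating the observation $y$ with a specific state, i.e. the
probability of taking $y$ as a noise corrupted version of $s_i$, are Gaussian:

$$ H_i : y \sim f_i(y) : N(s_i, \sigma^2 I_{m^2}) $$

where $I_{m^2}$ is the identity matrix of size $m^2 \times m^2$, so

$$ f_i(y) = (2\pi\sigma^2)^{-m^2/2} \exp\left( -\frac{\| y - s_i \|^2}{2\sigma^2} \right) $$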

The detection of the black square problem amounts to deciding for one of
the M non all-white patterns, i.e. one of the states $H_i$, $i = 1, 2, \ldots, M$, or for
the white pattern, i.e. the state $H_0$. This situation can directly be formulated as
a composite binary hypothesis testing problem: we group together the states
$H_i$, $i = 1, 2, \ldots, M$, into one that we will denote as $H_1$. For the resulting binary
hypothesis test we will have [Poo88,pp43]:

1. The prior probabilities: $\bar{p}_0 = 1 - Mp$, $\bar{p}_1 = Mp$.

2. The distribution of the observed data is:

$$ H_0 : y \sim \bar{f}_0(y) = f_0(y) $$
$$ H_1 : y \sim \bar{f}_1(y) = E[f_i(y)] = \frac{1}{M} \sum_{i=1}^{M} f_i(y) $$

3. The cost function (arbitrarily chosen): $c(\cdot,\cdot)$

4. The decision rule:

$$ d : d(y) = 0 \iff \frac{\bar{f}_1(y)}{\bar{f}_0(y)} < \frac{1 - Mp}{Mp} $$

which can be reduced to:

$$ \text{decide } 0 \iff \sum_{i=1}^{M} \alpha_i \exp\left( \beta_i\, y^T s_i \right) < t $$

for constants $\alpha_i$, $\beta_i$, $t$.


The generalized likelihood ratio test approach [Nar89] utilizes the max operator
instead of the expectation operator E[.] which was used in the previously presented
test. The resulting test in our application is:

$$ H_0 : y \sim \bar{f}_0(y) = f_0(y) $$
$$ H_1 : y \sim \bar{f}_1(y) = \max_i f_i(y), \quad i \in \{1, 2, \ldots, M\} $$

Although these tests guarantee some optimality given $\bar{f}_1(\cdot)$, they do not guarantee
any optimality given the distributions $f_i(\cdot)$, $i = 1, 2, \ldots, M$, which are the
ones actually given. This fact has led us to the following formulation of an M-ary
hypothesis testing problem.

We will not attempt to construct some pair of distribution functions $(\bar{f}_0, \bar{f}_1)$
as before; instead we will determine a cost function c(.,.) acting directly on the
states $H_i$, $i = 0, 1, \ldots, M$, which will implicitly do the grouping we need. More
explicitly let c(k,i) be the cost of deciding $H_i$ when the true state is $H_k$, with:

$$ c(k, i) = \begin{cases} 0, & k = i = 0 \ \text{or} \ 1 \le k, i \le M \\ 1, & \text{otherwise} \end{cases} $$

We can now apply the theory supporting the M-ary test [Nar89]; this will
give the following decision rule:

$$ d : d(y) = i \iff g_i(y) \le g_j(y), \quad 0 \le j \le M, \quad \text{where} $$
$$ g_i(y) = \sum_{k=0}^{M} p_k \left[ c(k, i) - c(k, k) \right] f_k(y), \quad \text{or} $$
$$ g_i(y) = \begin{cases} p \sum_{k=1}^{M} f_k(y), & i = 0 \\ (1 - Mp)\, f_0(y), & i = 1, 2, \ldots, M \end{cases} $$

which minimizes the mean cost value: $J(d) = E[c(H, d(Y))]$. We note the
fact that all $g_i$'s are identical for i=1,2,...,M, which causes an ambiguity in the
rule d. We resolve this ambiguity by defining

$$ d : d(y) = \min \{ k \mid g_k(y) \le g_j(y), \ 0 \le j \le M \} $$

which essentially is the intended binary decision rule.
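
The rule above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used for the plots: the name `decide_raw`, the constant `TEMPLATES_2X2` and the numeric values in the usage lines are hypothetical. The Gaussian normalizing constant is dropped because it is common to $g_0$ and $g_1$.

```python
import math

# The nine non all-white 2x2 patterns, flattened row by row (1 = black pixel).
TEMPLATES_2X2 = [
    (0, 0, 0, 1), (0, 0, 1, 1), (0, 0, 1, 0),
    (0, 1, 0, 1), (1, 1, 1, 1), (1, 0, 1, 0),
    (0, 1, 0, 0), (1, 1, 0, 0), (1, 0, 0, 0),
]

def decide_raw(y, templates, p, sigma2):
    """Return 0 (all-white) or 1 (part of the black square present).

    y: flattened real-valued window; templates: patterns s_1..s_M;
    p: prior of each non all-white pattern; sigma2: noise variance.
    """
    def lik(s):
        # Gaussian likelihood up to the factor shared by all hypotheses
        return math.exp(-sum((yj - sj) ** 2 for yj, sj in zip(y, s)) / (2.0 * sigma2))

    M = len(templates)
    g0 = p * sum(lik(s) for s in templates)      # g_0(y): risk of deciding "white"
    g1 = (1.0 - M * p) * lik([0] * len(y))       # g_i(y), i >= 1: risk of deciding "square"
    return 0 if g0 <= g1 else 1

if __name__ == "__main__":
    p = 1.0 / 49.0  # 8x8 image, 2x2 square (assumed example values)
    print(decide_raw((0.0, 0.0, 0.0, 0.0), TEMPLATES_2X2, p, 0.05))
    print(decide_raw((0.1, -0.2, 0.05, 0.9), TEMPLATES_2X2, p, 0.05))
```

Each call evaluates M inner-product-like sums over $m^2$ pixels, which is the computational load discussed next.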

The test can, with minor changes to the prior data probabilities, be applied to
detecting a general geometric pattern³ and not only the full black one. Nevertheless,
like the previously mentioned ones, it is computationally very expensive,
since for each check we need to evaluate M inner products of size $m^2$; we
note that such a product evaluation is inherent to the template matching problem.
However, since we are dealing with a specific template, the full black one, we will
show in chapter 5 that under mild conditions it suffices to use the sum-of-pixels
statistic instead of the template-vector as the input (the observation) for our test;
evidently the correlation in our case reduces to the sum of the values of all the
pixels in the image window, since we assign “1” as the value of the noise free
black pixel. But note that the equivalence of the correlation and the sum-of-pixels
statistic holds only when we have the full black or the full white template.

³ We can give equations for squares and triangles parametrized over scale and orientation.


2.2  Hypothesis  Testing  Based  on  the  Sum-of-Pixels  Statistic 

As described earlier the value (intensity) of each pixel in the noise corrupted
image is modeled as a random variable $y^j = s^j + n^j$ where $s^j$ takes values in
{0,1} and the $n^j$'s are independent, identically distributed (iid) random variables
such that $n^j \sim N(0, \sigma^2)$. Let us now consider the sum y of all the
pixel values in an image window. We will have:

$$ y = w + n, \quad \text{where} \quad y = \sum_{j \in I_w} y^j, \quad w = \sum_{j \in I_w} s^j $$

where $I_w$ = index set of the pixels of the image test window, w = the Hamming
weight or simply weight of the image window under question, and

$$ n = \sum_{j \in I_w} n^j \sim N\Big(0, \sum_{j \in I_w} \sigma^2\Big) = N(0, m^2\sigma^2) $$

since the $n^j$'s are Gaussian iid random variables. We denote by w(i) the weight of
the i'th window pattern, as presented in the previous subsection. This parameter
will characterize the states of the M-ary hypothesis test we will construct:

$$ H_i : y \sim N(w(i), m^2\sigma^2), \quad i = 0, 1, \ldots, M $$

(M as defined in the previous subsection). In simple words different states
correspond to different levels of darkness. This property is sufficient to characterize
the all-white pattern, i.e. the one we want to single out, and can be obtained as
the output of the correlating process with the full black template as described at
the end of the last subsection.⁴

The prior probabilities of the states, as well as the cost function, will be
identical to the ones found for the M-ary test on the raw data. The decision
rule induced will automatically group the states $H_1, H_2, \ldots, H_M$ into one: $H_1$.
So we have to distinguish the two states:

$$ H_1 : w > 0 $$
$$ H_0 : w = 0 $$

The M-ary hypothesis framework [Nar89] provides the rule:

$$ d : d(y) = i \iff g_i(y) \le g_j(y), \quad 0 \le j \le M, $$
$$ \text{where} \quad g_i(y) = \sum_{k=0}^{M} p_k \left[ c(k, i) - c(k, k) \right] f_k(y), $$

which minimizes the mean cost E[c(i,j)]. If we use the cost function of the
previous subsection and make a trivial ambiguity waiving assumption we obtain the rule:

$$ d : d(y) = i \iff g_i(y) \le g_{1-i}(y), $$
$$ g_i(y) = \begin{cases} \sum_{h \in S} p_h f_h(y), & i = 0 \\ (1 - Mp)\, f_0(y), & i = 1 \end{cases} \qquad f_h(y) : N(w(h), m^2\sigma^2), $$

where S is the set of distinct non all-white patterns and $p_h$ are their prior probabilities.

⁴ Notice that w(i) will lie in the range $0, 1, \ldots, m^2$; though as i ranges in the set of all
possible patterns w(i) will not take all the values in $0, 1, \ldots, m^2$ and also certain duplicates
will arise. So the $H_i$'s will not all be distinct. This is not a problem since we just want to
distinguish $H_0$ from all other states, which is feasible because w(0) exists and has no duplicates.
In subsection A.1 we show how we can take advantage of the nature of the w(.) function in
order to reduce the complexity of our computations.

For all the sample tests we study in the present work (i.e. for various values
of the image and the square size parameters n and m) this rule can numerically
be reduced to

$$ d : d(y) = 0 \iff y \le y_0 $$

for some threshold value $y_0$. This threshold determines the tolerance we can
have when deciding the matching as a function of the noise level ($\sigma^2$). In
figure 2.2 we attempt to give pictorially the intuition of how $y_0$ is found.

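For performance evaluation we need the following:

Probability of false alarm:

$$ P_{fa} = \Pr\{ d \ne 0 \mid H = 0 \} = \Pr\{ y > y_0 \mid H_0 \} \;\Rightarrow\; P_{fa} = 1 - \Phi\left( \frac{y_0}{m\sigma} \right), \quad \text{where} \quad \Phi(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-t^2/2}\, dt $$

Probability of detection:

$$ P_d = \Pr\{ d \ne 0 \mid H \ne 0 \} = \sum_{i=1}^{M} \Pr\{ d \ne 0 \mid H = i \} \Pr\{ H = i \mid H \ne 0 \} $$
$$ = \frac{1}{M} \sum_{i=1}^{M} \left[ 1 - \Phi\left( \frac{y_0 - w(i)}{m\sigma} \right) \right] = \frac{1}{Mp} \sum_{h \in S} p_h \left[ 1 - \Phi\left( \frac{y_0 - w(h)}{m\sigma} \right) \right] $$

where h ranges in the set S of all distinct non all-white patterns and $p_h$ are the
prior probabilities of finding such a pattern.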
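
To make the threshold and the two performance measures concrete, the following Python sketch locates $y_0$ by bisection on the condition $g_0(y) = g_1(y)$ and then evaluates $P_{fa}$ and $P_d$ from the expressions above. The function names and the example values (n=16, m=2, $\sigma^2$=0.01) are assumptions made for illustration; the sketch also assumes $1 - Mp > 0$ and relies on the fact that $g_0(y)/g_1(y)$ increases with y because every non all-white weight is positive.

```python
import math

def phi(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def pattern_weights(m):
    """Weights w(i) of the (2m-1)^2 non all-white windows of an m x m black square."""
    overlaps = list(range(1, m)) + [m] + list(range(m - 1, 0, -1))
    return [a * b for a in overlaps for b in overlaps]

def awgn_threshold(n, m, sigma2):
    """Solve g_0(y0) = g_1(y0) for the sum-of-pixels statistic (hypothetical helper)."""
    weights = pattern_weights(m)
    M = len(weights)
    p = 1.0 / (n - m + 1) ** 2
    var = m * m * sigma2                      # variance of the summed noise

    def log_ratio(y):
        # log(g_0(y) / g_1(y)), computed with a log-sum-exp for stability
        terms = [(2.0 * y * w - w * w) / (2.0 * var) for w in weights]
        top = max(terms)
        return (math.log(p / (1.0 - M * p)) + top
                + math.log(sum(math.exp(t - top) for t in terms)))

    lo, hi = -1.0, 1.0
    while log_ratio(lo) > 0:                  # widen the bracket if needed
        lo *= 2.0
    while log_ratio(hi) <= 0:
        hi *= 2.0
    for _ in range(100):                      # bisection
        mid = 0.5 * (lo + hi)
        if log_ratio(mid) <= 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def awgn_performance(n, m, sigma2):
    """Return (y0, P_fa, P_d) for the optimal threshold rule."""
    y0 = awgn_threshold(n, m, sigma2)
    sd = m * math.sqrt(sigma2)
    weights = pattern_weights(m)
    p_fa = 1.0 - phi(y0 / sd)
    p_d = sum(1.0 - phi((y0 - w) / sd) for w in weights) / len(weights)
    return y0, p_fa, p_d

if __name__ == "__main__":
    print(awgn_performance(16, 2, 0.01))
```

With these example values the threshold falls between the weights 0 and 1, so the false alarm probability is small while the detection probability stays high, which is the low-noise behaviour described for plot 1.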


[Figure 2.2: the sum-of-pixels axis 0, 1, 2, 3, 4 with the threshold $y_0$ lying between 1 and 2; decide 0 (non matching) to the left of $y_0$, decide 1 (matching) to the right.]

Figure 2.2 The decision regions for a mxm = 2x2 template in the AWGN case

In plot 1 the $P_d$ vs. $P_{fa}$ curve, parametrized by the noise variance $\sigma^2$, for
optimal Bayesian tests is shown for both the theoretical analysis and the simulation
results. We observe that for little noise (small values of $\sigma^2$) $P_{fa}$ is small and $P_d$ is
high; this happens for small values of the threshold $y_0$. Adding more noise (letting
$\sigma^2$ get larger) causes $P_{fa}$ to increase, while $P_d$ gets smaller and $y_0$ gets larger. A
larger $y_0$ means that the rule gets conservative, i.e. it needs more evidence
(darkness) in order to infer “black square detected”. If the process of adding noise
is continued ($\sigma^2$ gets large enough) $y_0$ gets large; this process gradually leads to
the rule “never decide black square detected”, i.e. to the point (0,0) of the plot
($P_{fa}$, $P_d$). In plot 2 the ROC of the test for m=5, $\sigma^2$=0.1 is shown. The simulation
results are produced by using one Monte Carlo run and assume that the optimal
thresholds are known from the theoretical analysis.


2.3  Testing  Under  the  Effect  of  BSC  Noise 

Consider again the problem of finding a black square in a binary image
corrupted by noise; but now say that the noise comes as an effect of image
transmission over a memoryless binary symmetric channel. The BSC noise, as
presented in the previous chapter, will result in a distorted binary image. The
use of the sum-of-pixels statistic y is equivalent to counting the black pixels in
the image window we scan. We will describe a hypothesis testing approach that
gives a test minimizing the probability of an erroneous decision.

Similarly to the AWGN case we construct an M-ary test for which the states

$$ H_h, \quad h \in S \cup \{0\} $$

are characterized by the Hamming weight of each possible mxm pattern generated
by scanning the noise free image; specifically S is the set of non all-white such
patterns. The prior probabilities of these states are denoted as $p_h$ and they are
equal to the ones computed in the previous paragraph. The cost function which
implicitly transforms the M-ary hypothesis into a binary one again is:

$$ c(h, k) = \begin{cases} 0, & h = k = 0 \ \text{or} \ h, k \in S \\ 1, & \text{otherwise} \end{cases} $$

The two hypotheses for the binary test are:

$$ H_0 : h = 0 $$
$$ H_1 : h \ge 1 $$

The decision rule will look identical to the one derived for the AWGN case:

$$ d : d(y) = i \iff g_i(y) \le g_{1-i}(y), $$
$$ g_i(y) = \begin{cases} \sum_{h \in S} p_h P_h(y), & i = 0 \\ p_0 P_0(y), & i = 1 \end{cases} $$

where $P_h(y)$, $h \in S \cup \{0\}$, correspond to the $f_h(y)$ for the AWGN case, and are
distributions which we are still missing in order to be able to evaluate our test.
$P_h(y)$ is the probability of observing an image window with y black pixels in it,
while the corresponding noise free window had h ones. As already discussed in
chapter 1 we call these discretized probabilities transition probabilities. We will
now show how to evaluate these probabilities.


[Figure 2.3: (a) an h-vector mapped to a y-vector with probability $P_h(y)$; (b) the two-step process: h-vector to k-vector, k in [0,h], then k-vector to y-vector, y-k in [0,m-h].]

Figure 2.3 The transition probabilities for the BSC noise case


The image window is represented as a binary vector of length $\bar{m} = m^2$. The
shaded region in figure 2.3a represents 1's and the blank region represents
0's. We want to find the probability of getting a configuration with y 1's if our
vector with h 1's passes through a BSC with bit inversion probability equal to $\epsilon$.
Since we are interested only in the “area” of common and non-common shaded
places between the h- and y-vectors and not the specific positions of black and
white pixels we will assume that all the shaded area in the h-vector is stacked
on the left.

The key idea is as follows: The probability of getting a y-vector from an
h-vector is equal to the joint probability of getting a k-vector from the
h-vector, where $k \le h, y$ is the number of common shaded places between
the h- and y-vectors, and getting a y-vector from the k-vector, where y-k
shaded places are not common with the h-vector. The 2-step process is depicted
in figure 2.3b.

Since the two events a and b (see fig. 2.3.b) are related with uncommon areas
(different bits) and the binary noise is white they are independent, i.e.

$$ P_{ab}(k) = P_a(k) \cdot P_b(k) $$

We have, with $\bar{m} = m^2$ the length of the window vector:

$$ P_a(k) = \binom{h}{k} (1-\epsilon)^{k} \epsilon^{h-k} $$
$$ P_b(k) = \binom{\bar{m}-h}{y-k} \epsilon^{y-k} (1-\epsilon)^{\bar{m}-h-(y-k)} $$

with the constraints:

$$ 0 \le k \le h $$
$$ 0 \le y - k \le \bar{m} - h \;\Rightarrow\; h - \bar{m} \le k - y \le 0 \;\Rightarrow\; h + y - \bar{m} \le k \le y $$
$$ k \in [\max(0, h + y - \bar{m}), \min(h, y)] = [k_{min}, k_{max}] $$

So

$$ P_h(y) = \sum_{k=k_{min}}^{k_{max}} P_a(k) P_b(k) $$
$$ P_h(y) = \sum_{k=\max(0, h+y-\bar{m})}^{\min(h,y)} \binom{h}{k} \binom{\bar{m}-h}{y-k} (1-\epsilon)^{\bar{m}-h-y+2k} \epsilon^{h+y-2k} $$

If we substitute this expression of $P_h(y)$ in the decision rule then this
becomes:

$$ d : d(y) = 0 \iff \frac{1}{p_0} \sum_{h \in S} p_h \sum_{k=k_{min}}^{k_{max}} \binom{h}{k} \binom{\bar{m}-h}{y-k} \left( \frac{\epsilon}{1-\epsilon} \right)^{h-2k} \le \binom{\bar{m}}{y} $$

and by a further numerical simplification:

$$ d : d(y) = 0 \iff y \le y_0, $$

where $y_0$ is some threshold in the range from 0 to $\bar{m}$.
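
The transition probabilities can be evaluated directly from the sum over k. The sketch below is illustrative only: the function name `transition_prob` and the argument name `m_bar` (the number of pixels in the window) are assumptions introduced here.

```python
from math import comb

def transition_prob(h, y, m_bar, eps):
    """P_h(y): probability that a window with h ones shows y ones after a BSC(eps).

    k counts the ones of the noise free window that survive; the remaining
    y - k ones come from inverted zeros.
    """
    k_min, k_max = max(0, h + y - m_bar), min(h, y)
    return sum(
        comb(h, k) * comb(m_bar - h, y - k)
        * (1 - eps) ** (m_bar - h - y + 2 * k) * eps ** (h + y - 2 * k)
        for k in range(k_min, k_max + 1)
    )

if __name__ == "__main__":
    # Each row of the transition matrix should sum to one.
    for h in (0, 1, 2, 4):
        print(h, sum(transition_prob(h, y, 4, 0.05) for y in range(5)))
```

For h=0 the expression collapses to the binomial distribution of the number of inverted white pixels, which is the distribution that appears in the false alarm probability.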
For the performance evaluation of the test we need:

Probability of false alarm:

$$ P_{fa} = \Pr\{ d \ne 0 \mid H = 0 \} = \Pr\{ y > y_0 \mid H_0 \} \;\Rightarrow\; P_{fa} = \sum_{y=y_0+1}^{\bar{m}} \binom{\bar{m}}{y} (1-\epsilon)^{\bar{m}-y} \epsilon^{y} $$

where $\bar{m} = m^2$ is the number of pixels in the window.

Probability of detection:

$$ P_d = \Pr\{ d \ne 0 \mid H \ne 0 \} = \sum_{h \in S} \Pr\{ d \ne 0 \mid H = h \} \Pr\{ H = h \mid H \ne 0 \} = \frac{1}{1 - p_0} \sum_{h \in S} p_h \left[ 1 - \Pr\{ y \le y_0 \mid H = h \} \right] $$

The $P_d$ vs. $P_{fa}$ curve for optimal Bayesian rules parametrized by the pixel
inversion probability $\epsilon$ is given in plot 3 for the case of m=8. The shape of this
curve is similar to the one of plot 1, in the sense that for little noise we have
small $P_{fa}$ and high $P_d$, while if we add noise we gradually move towards the
point ($P_{fa}$=0, $P_d$=0). The optimal Bayesian rules range from “decide black square
detected if sum-of-pixels > 5” for $\epsilon$=0.02 to “decide black square detected if
sum-of-pixels > 37” for $\epsilon$=0.4 and effectively “never decide black square detected”
(i.e. reach the point (0,0) of the curve ($P_{fa}$, $P_d$)) for even larger amounts of noise.
The stairs-like shape of the curve is due to the discrete nature of the threshold and
makes apparent the trade-off between high $P_d$ and low $P_{fa}$ that the optimal
Bayesian rule attempts to balance. In plot 4 we see the ROC of the test
produced as a result of the above analysis for the case of m=8 and $\epsilon$=0.1. The
ROC found by simulation is also given; again one Monte Carlo run is used, and
optimal threshold values are assumed to be known from the numerical analysis.
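
The numerical reduction of the Bayesian rule to a threshold, and the resulting $P_{fa}$ and $P_d$, can be sketched as follows. This is an illustration under stated assumptions rather than the code used for plots 3 and 4: the function names are hypothetical, the image size n is a free parameter chosen by the caller, the priors $p_h$ are obtained by counting the patterns of each weight, and the decide-0 region is assumed to be an initial segment {0, ..., $y_0$}.

```python
from collections import Counter
from math import comb

def transition_prob(h, y, m_bar, eps):
    """P_h(y) for a window of m_bar pixels passed through a BSC(eps)."""
    k_min, k_max = max(0, h + y - m_bar), min(h, y)
    return sum(
        comb(h, k) * comb(m_bar - h, y - k)
        * (1 - eps) ** (m_bar - h - y + 2 * k) * eps ** (h + y - 2 * k)
        for k in range(k_min, k_max + 1)
    )

def bsc_test(n, m, eps):
    """Return (y0, P_fa, P_d) for the sum-of-pixels test under BSC noise."""
    m_bar = m * m
    overlaps = list(range(1, m)) + [m] + list(range(m - 1, 0, -1))
    counts = Counter(a * b for a in overlaps for b in overlaps)  # weight h -> number of patterns
    p = 1.0 / (n - m + 1) ** 2
    prior = {h: c * p for h, c in counts.items()}               # p_h for h in S
    p0 = 1.0 - sum(prior.values())                               # prior of the all-white state

    def g0(y):
        return sum(ph * transition_prob(h, y, m_bar, eps) for h, ph in prior.items())

    def g1(y):
        return p0 * transition_prob(0, y, m_bar, eps)

    # Largest y0 such that the rule decides 0 for every y <= y0 (-1 if none).
    y0 = -1
    while y0 + 1 <= m_bar and g0(y0 + 1) <= g1(y0 + 1):
        y0 += 1

    tail = range(y0 + 1, m_bar + 1)
    p_fa = sum(transition_prob(0, y, m_bar, eps) for y in tail)
    p_d = sum(ph * sum(transition_prob(h, y, m_bar, eps) for y in tail)
              for h, ph in prior.items()) / (1.0 - p0)
    return y0, p_fa, p_d

if __name__ == "__main__":
    print(bsc_test(16, 2, 0.05))
```

Because the threshold depends on the priors, and hence on n, the thresholds quoted above for m=8 are reproduced only if the same image size as in the original experiments is supplied.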


2.4  Testing  Based  on  Image  Data  of  a  Coarsened  Resolution 

Consider a binary image corrupted by BSC noise with parameter $\epsilon$. We
can code each 2 x 2-pixel block of it with one bit indicating all-black or all-white
block, depending on the bit value. The noise of the original image will
“propagate” to the coded data, resulting in bit inversions. Since the noise in the
original data is white, the noise in the coded data will be white too.


Figure 2.4 The Binary Channel transition diagram

The relation of the noise-free-coded data and the noise-corrupted-and-then-coded
data is depicted in figure 2.4. The inversion probabilities $\epsilon_0$ and $\epsilon_1$
depend on the specific coding rule. If the coding (quantization) rule is the one
given in table 1.2 (subsection 1.3), then because of the symmetry of the rule we
will have $\epsilon_0 = \epsilon_1 = \bar{\epsilon}$. So the noise on the coded data can be modeled again as BSC
noise with inversion probability $\bar{\epsilon}$, which depends on the noise parameter $\epsilon$ of the
original data as follows:

$$ \bar{\epsilon} = \Pr\{\text{have more than 2 bit inversions}\} + \frac{1}{2} \Pr\{\text{have 2 bit inversions}\} $$
$$ = \sum_{i=3}^{4} \binom{4}{i} \epsilon^{i} (1-\epsilon)^{4-i} + \frac{1}{2} \binom{4}{2} \epsilon^{2} (1-\epsilon)^{2}. $$

An analysis for the detection of the black square based on these coded data is
identical to the one we made in the last subsection. The performance evaluation
curves for this test are given in plots 5 and 6. We observe that they have the same
shape as plots 3 and 4 respectively, which were drawn for the uncoded data.
However note that the ROC curve for $\epsilon$=0.1 is closer to the point ($P_{fa}$=0, $P_d$=1);
this means that tests acting on data of coarsened resolution are more reliable (for
this specific object target and amount of noise) than those acting on the original
pixel data. Similar observations are further discussed in subsection 5.1.


2.5  Summary 

In this chapter we described a simple instance of the template matching
problem using noise corrupted data. Initially we considered Gaussian noise and
tried to find some optimal matching test. We briefly examined three tests acting on
the raw data, starting from the most straightforward approach of the composite
binary hypothesis test, continuing with the generalized likelihood ratio test
and reaching the most structured M-ary hypothesis test.

We noticed that for the case of the full black template we can avoid the
computational load of inner products, or equivalently of convolutions, and introduced
the sum-of-pixels statistic. We developed and evaluated a binary hypothesis test
“residing” on an M-ary test, which uses the sum-of-pixels statistic as its
observation. This test, designed for the AWGN case, was properly modified for the
case of the BSC noise, and a further modification of the latter gave a test based on
coarser resolution image data. Sample simulation results were given for both tests.

An attempt was made to show the steps which led to the specific
formulation of the BSC noise case of the problem. This formulation
will be expanded in the fourth chapter in order to implement a feature extraction
algorithm applicable either to multiple-gray level images or to block coded image
data.

Chapter 3
Image Coding

3.1  Block  Coding  Techniques 

A block coding technique transforms the nxn-pixel valued matrix representing
an image into a $c_n \times c_n$-block code valued matrix ($c_n$ = n/“block size”),
representing the encoded version of the original image; each block is encoded
separately from all the others; we consider multiple level gray images. The coding
process may be divided into three steps, namely preprocessing, quantization
and the coding itself, as shown in table 3.1 for three different block coding
schemes.


scheme  input        preprocessing     quantization      source coding    output

1       pixel image  whitening         scalar quant.     Huffman coding   block image
                     (KLT, DCT,        (Max-Lloyd qu.)   or enumeration
                     Hadamard basis)

2       pixel image  normalization     vector quant.     Huffman coding   block image
                                                         or enumeration

3       pixel image  -                 modified          enumeration      block image
                                       Hadamard basis

pixel image: nxn-pixel valued matrix; block image: c_n x c_n-block code valued matrix

Table 3.1 Block quantization / coding schemes applied to image data


For the schemes 1 and 2 the pixel values in a block form a vector of
random variables; the randomness is due to the image statistics and is assumed to be
sufficiently described by the first and second order histograms. In [HuSc63] the
block-vector, composed of 4 Markovian random variables, is whitened by using
the KLT (see paragraph 1.4, also [NeLi80],[San89]) and then each component
is separately quantized by the Max-Lloyd quantizer [Max60]. It turns out that 8
bits are adequate for representing a quantized vector. In [LaSl71] the 4x4-pixel
blocks are coded by the Hadamard basis (see par. 1.4). It is shown that 32 bits
are adequate for representing a quantized block. Observe that both techniques
require a rate of 2 bpp (bits per pixel).

Vector  Quantization  (VQ)  [NaKi88]  can  give  better  compression  ratios  for  a 
given  performance  than  the  scalar  quantization.  This  is  an  information  theoretic 
result  broadly  used  in  several  applications  in  the  last  few  years.  In  [MAY82] 
VQ  is  used  for  the  image  block-vectors  while  their  components  are  normalized 
over  their  variances.  We  should  note  that  the  VQ  procedure  has  some  underlying 
distortion  measure,  applied  on  pairs  of  quantized  vectors;  the  selection  of  this 
distortion  measure  is  application  dependent  [MRG85]  and  determines  the  VQ 
performance  to  a  great  extent. 

Both  of  the  above  schemes  provide  us  with  a  block  alphabet  of  quantized 
vectors  to  which  a  code  must  be  assigned.  This  code  may  amount  to  simply 
enumerating  the  codewords  or  using  a  Huffman  code,  so  that  the  information 
theoretic  properties  of  the  alphabet  may  be  exploited.  However  as  pointed  out 
in  chapter  1  we  still  require  the  same  number  of  bits  for  coding  each  quantized 
block-vector  and  therefore  we  should  use  simple  enumeration  in  our  application. 

The scheme 3 of table 3.1 is described in subsection 3.3 while the
reasoning that led us to adopt it is given in the following paragraph.

3.2 Constraints on the Block Alphabet Size Due
to the Matching Test

Let l be the number of bits assigned to each codeword (block) of the image
block-alphabet. The size of the alphabet will be $M = 2^l$. We usually have

$$ 2^l \ll g^s = \text{\# of possible non quantized blocks}, $$

where g is the number of gray levels in the original image and s is the size of the
blocks in pixels. So l parametrizes the quantization level of the original image
and therefore affects the quality of the image.

Suppose now that the original image has been corrupted by a noise characterized
by some parameter, e.g. $\epsilon$ or $\sigma^2$ as we have seen in the previous chapters.
Consequently we may erroneously take a codeword i to be some other, say j.
We remind the reader that we call the probabilities $\epsilon_{ij}$ of these events transition
probabilities. Evidently the noise parameter ($\epsilon$) affects the quality of the
reconstructed image as well.

As  already  discussed  we  intend  to  apply  some  feature  extraction  algorithm 
(template  matching)  onto  the  coded  data;  we  will  obtain  a  performance  evaluation 
measure $(P_{fa}, P_d)$ which will be interpreted as the image quality index; thereafter
we will examine the effects of the two noise sources (having the parameters $l$ and $\epsilon$)

$$\left.\begin{array}{l}\text{quantization } (l)\\ \text{external noise } (\epsilon)\end{array}\right\}\;\longrightarrow\;(P_{fa}, P_d)$$

on the (reconstructed) image quality. We will do so by formulating the template
matching  as  a  hypothesis  testing  problem.  In  the  next  chapter  it  is  shown  that  if  the 
template is of size $n$ blocks, then the complexity of the test will be parametrized
over the template codebook size

$$S(M,n) = \binom{M+n-1}{n}.$$

We  observe  that  : 

1. The coding schemes presented in the previous paragraph give rise to an
$S(M,n)$ parameter that is too large to allow a test to be implemented.

2. We cannot afford $l > 4$ (i.e. $M > 16$) and $n > 9$, since $S(16,9) \approx 1.3 \times 10^6$.

Note that in the numerical analysis implemented on a Sun workstation the values
of the parameters are $M = 16$, $n = 4$; so $S(M,n) = 3876$. If we want to have a reasonable
size for our template, e.g. 12x12-pixel or equivalently 3x3-block of 4x4-pixel
blocks, we effectively need to restrict ourselves to the case of the binary image.
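
The figures quoted above can be checked with a few lines of code. The sketch below assumes the closed form for the histogram alphabet size derived in subsection 4.2.1, S(M,n) = C(M+n-1, n); the function name `histogram_alphabet_size` is ours and is used for illustration only.

```python
from math import comb


def histogram_alphabet_size(M: int, n: int) -> int:
    """Number of distinct histograms of n blocks over an alphabet of M codes.

    This is the number of multisets of size n drawn from M symbols,
    i.e. the binomial coefficient C(M + n - 1, n).
    """
    return comb(M + n - 1, n)


# The two cases quoted in the text, plus the binary alphabet S(2, n) = n + 1.
for M, n in [(16, 9), (16, 4), (2, 9)]:
    print(f"S({M},{n}) = {histogram_alphabet_size(M, n)}")
```

With M = 16 the size grows from 3876 at n = 4 to about 1.3 million at n = 9, which is the growth that motivates the restriction to small alphabets and templates.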

A  quite  different  approach  for  solving  the  “image  evaluation”  problem  as  it 
has  so  far  been  formulated,  will  be  mentioned,  even  though  it  will  not  be  examined 
in  the  present  work.  This  comes  from  the  VQ  technique  using  the  Itakura-Saito 
measure  [MRG85];  the  performance  evaluation  of  the  matching  test  can  then  be 
expressed  through  this  distortion  measure  and  thus  help  us  avoid  the  formulation 
of  the  hypothesis  testing  problem. 


5 Nevertheless it is worthwhile to:

a. Try to adapt the coding scheme of [TaFa86] to our problem since it furnishes rates up
to 0.25 bpp and it is robust to the noise.

b. Exploit the fact that most $\epsilon_{ij}$'s equal 0 and thus the complexity may be reduced from
the order of the codebook size to the size of a "small" hypercube around the point of
interest.
3.3 A Modified Hadamard Basis


The available 4-bit codeword for a block can represent $2^4 = 16$ different
blocks. We assign all bits to one variable which is intended to map the 16 "most
probable" 4x4 blocks.


[Figure 3.1: for each pattern of the basis, a. binary pattern, b. relative variance ($\sigma_i^2$), c. prior probability estimation ($p_i$). Values legible in the figure: 0.3594, 0.0352, 0.087, 0.0313, 0.035, 0.0126.]

Figure 3.1 The modified 4x4 Hadamard basis


How to determine the 16 patterns: In [LaS171] the Hadamard basis for
4x4-pixel blocks is given (see also fig. 1.5), along with a measure of variance $\sigma_i^2$
of the random variable corresponding to each element $i$. We will use this measure
for making an estimation of the prior probabilities of having these blocks. The
Hadamard basis is used for multiple gray level images; negative intensity factors
can reverse the pattern, e.g. a black block may be converted into a white one. In
our application we do not use such a factor; instead we use the first 8 patterns of the
Hadamard basis plus their reversed versions. We will take the prior probabilities
of them to be

$$p_{2i} = p_{2i+1} = \frac{\sigma_i^2/2}{\sum_{j=0}^{7}\sigma_j^2}, \qquad i = 0, 1, \ldots, 7.$$
Table  3.2  The  Hamming  distances  among  the  elements  of  the  modified  Hadamard  basis 
The 16 patterns we will use and their prior probabilities are shown in
figure 3.1. The Hamming distances between pairs of block patterns are shown
in table 3.2.

Transitions  are  caused  by  the  white  binary  noise  corruption.  Nevertheless  they 
are  governed  by  the  quantization  procedure,  i.e.  the  rule  used  to  map  the  noise 
corrupted  data  back  to  the  limited  alphabet  of  the  16  blocks.  These  transitions 
are  discussed  in  the  next  paragraph. 


3.4  The  Quantization/Coding  Procedure 

We will start by making some observations: First, the Hamming distance $d(i,j)$
between two different codewords $i,j$ is at least 8 (see table 3.2); therefore we may
think of the codewords as the centers of spheres in the $\{0,1\}^{16}$-space with radius
half the Hamming distance, $\rho = 4$. Secondly, not all codewords coming up from
the Hadamard basis are used; actually exactly half of them are considered in our
coding scheme.

The quantization rule:

For each 4x4-pixel block (call it $y$) of the noise corrupted image do:

1. if $d(y,i) < 4$ for some codeword $i$ then decode $y \to i$

2. if $d(y,i) > 4$ for all $i$

a. if "weight of y" ($w(y)$) $> 8$ decode $y \to 0$ (full black block)

b. if $w(y) < 8$ decode $y \to 1$ (full white block)

c. if $w(y) = 8$,

decode $y \to 0$ with probability 0.5,
decode $y \to 1$ with probability 0.5

3. if $d(y,i) = 4$ for one or more $i$'s then

decode $y \to i^*$ where $i^* = \arg\max_{j \in I(y)} \{\text{prior}(j)\}$, with $I(y) = \{j : d(y,j) = 4\}$.
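
The quantization rule can be sketched in code as follows. This is only an illustration under stated assumptions: the actual 16 patterns of figure 3.1 are not reproduced here, so `build_codebook` substitutes a hypothetical codebook made of the first 8 rows of a Sylvester-type Hadamard matrix of order 16 plus their complements (which also has minimum Hamming distance 8), the `priors` list is supplied by the caller, and a pixel value of 1 is taken to mean black.

```python
import random


def sylvester_hadamard(order):
    """Sylvester-type Hadamard matrix with entries +1/-1, as a list of rows."""
    H = [[1]]
    while len(H) < order:
        H = [row + row for row in H] + [row + [-x for x in row] for row in H]
    return H


def build_codebook():
    """Hypothetical 16-word codebook: 8 Hadamard rows, each followed by its complement."""
    codebook = []
    for row in sylvester_hadamard(16)[:8]:
        bits = tuple(1 if x == 1 else 0 for x in row)  # 1 = black pixel (assumption)
        codebook.append(bits)
        codebook.append(tuple(1 - b for b in bits))
    return codebook  # codebook[0] is full black, codebook[1] is full white


def hamming(a, b):
    """Hamming distance between two equal-length bit tuples."""
    return sum(x != y for x, y in zip(a, b))


def quantize(y, codebook, priors, rng=random):
    """Map a 16-bit block y to a codeword index following rules 1-3."""
    dists = [hamming(y, c) for c in codebook]
    dmin = min(dists)
    if dmin < 4:  # rule 1: y lies strictly inside one sphere
        return dists.index(dmin)
    if dmin == 4:  # rule 3: on a sphere boundary, pick the largest prior
        ties = [i for i, d in enumerate(dists) if d == 4]
        return max(ties, key=lambda i: priors[i])
    weight = sum(y)  # rule 2: far from every codeword, decide by weight
    if weight > 8:
        return 0
    if weight < 8:
        return 1
    return 0 if rng.random() < 0.5 else 1
```

Rule 3 is tested before rule 2 in the code only because the two conditions are mutually exclusive; the order does not change the outcome.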

What  are  the  transition  probabilities  : 

We assume that the original scene of the image can accurately be represented
by the block alphabet we have, i.e. the image can be decomposed into blocks
each of which represents an alphabet element, say $i$. The image is corrupted by
binary noise with parameter $\epsilon$; so the element $i$ will be translated into a 4x4
binary block (call it $y$) which may not be an alphabet element. Nevertheless,
the quantization rule given above will decode it into some alphabet element, say
$j$, which may be different from $i$. Once more we repeat that we call transition
probability $\Pr\{i \to j\} = P(j \mid i) = \Pr\{d = j \mid H = i\}$ the probability of deciding that the
element (codeword) $j$ appears in some place of the image, given that the element
(codeword) $i$ was at that place of the original image. We have to calculate the
probabilities $P(j \mid i)$, $i,j = 0, 1, \ldots, 15$.

We have:

$$\Pr\{d = j \mid H = i\} = \sum_{y \in B_j} \Pr\{Y = y \mid H = i\} \qquad (1)$$

where $B_j$ is the decision region for the codeword $j$; note that

$$\Pr\{Y = y \mid H = i\} = \epsilon^{d(y,i)} (1-\epsilon)^{16 - d(y,i)}.$$
The decision regions, according to the quantization rule are:

$$B_0 = \{y : (d(y,0) \le 4) \vee \mathrm{cond0}(y)\} \qquad (2)$$

$$\mathrm{cond0}(y) = \big((w(y) > 8) \vee (w(y) = 8 \text{ w.p. } 1/2)\big) \wedge \big(d(y,i) > 4,\ i = 0, \ldots, 15\big)$$

$$B_1 = \{y : (d(y,1) \le 4) \vee \mathrm{cond1}(y)\} \qquad (3)$$

$$\mathrm{cond1}(y) = \big((w(y) < 8) \vee (w(y) = 8 \text{ w.p. } 1/2)\big) \wedge \big(d(y,i) > 4,\ i = 0, \ldots, 15\big)$$

$$B_i = \Big\{y : (d(y,i) \le 3) \vee \big((d(y,i) = 4) \wedge (i = \arg\max_{j \in I(y)} \text{prior}(j))\big)\Big\} \qquad (4)$$

where $i = 2, 3, \ldots, 15$ and $I(y) = \{j : d(y,j) = 4\}$.

For j = 0:

Since $d(y,0) \le 4 \iff w(y) \ge 12$ we get:

$$(1),(2) \Rightarrow \Pr(0 \mid i) = \sum_{y : w(y) \ge 12} \Pr\{y \mid i\} + \sum_{\substack{y : d(y,j) > 4,\ j = 0,\ldots,15 \\ w(y) > 8 \text{ or } w(y) = 8 \text{ w.p. } 1/2}} \Pr\{y \mid i\}$$

$$= \Pr\{y : w(y) \ge 12 \mid i\} + \Pr\{y : d(y,j) > 4,\ j = 0,\ldots,15 \mid i\} \times \Big[\Pr\{w(y) > 8 \mid i\} + \tfrac{1}{2}\Pr\{w(y) = 8 \mid i\}\Big] \qquad (5)$$

For the first term of (5) we have:

$$\Pr\{y : w(y) \ge 12 \mid i\} = \Pr\{w(i) \to w,\ w = 12, \ldots, 16\} = \sum_{j=12}^{16} P_{w(i)}(j)$$

and

$$P_{w(i)}(j) = \Pr\{\text{get a 16-bit length word with } j \text{ 1's out of a same length word with } w(i) \text{ 1's}\}$$

$$= \sum_{k = \max\{0,\, w(i)+j-16\}}^{\min\{w(i),\, j\}} \binom{w(i)}{k} \binom{16 - w(i)}{j - k} (1-\epsilon)^{16 - w(i) - j + 2k}\, \epsilon^{\,j + w(i) - 2k}.$$

If $i = 2, 3, \ldots, 15$ then $w(i) = 8$ and:

$$P_{w(i)}(j) = P_8(j) = \sum_{k = j-8}^{8} \binom{8}{k}\binom{8}{j-k} (1-\epsilon)^{8 - j + 2k}\, \epsilon^{\,8 + j - 2k}, \qquad j = 12, \ldots, 16.$$

If $i = 0$ then $w(0) = 16$, only the term $k = j$ survives, and:

$$P_{w(0)}(j) = P_{16}(j) = \binom{16}{j} (1-\epsilon)^{j}\, \epsilon^{\,16-j}.$$

If $i = 1$ then $w(1) = 0$, only the term $k = 0$ survives, and:

$$P_{w(1)}(j) = P_0(j) = \binom{16}{j} \epsilon^{\,j} (1-\epsilon)^{16-j}.$$
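
The weight-transition probability, i.e. the probability that a 16-bit word with w ones is turned by independent bit inversions (probability eps each) into a word with j ones, can be evaluated directly. The sketch below is a minimal illustration; the function name `weight_transition` and its parameter names are ours. In the sum, k counts the ones that survive, so w - k ones are inverted to zeros and j - k zeros are inverted to ones.

```python
from math import comb


def weight_transition(w, j, eps, n=16):
    """Probability that an n-bit word of weight w becomes a word of weight j
    when every bit is inverted independently with probability eps."""
    lo = max(0, w + j - n)  # at most n - w zeros can turn into ones
    hi = min(w, j)          # surviving ones cannot exceed w or j
    total = 0.0
    for k in range(lo, hi + 1):
        flips = (w - k) + (j - k)  # inverted bits: lost ones + gained ones
        total += comb(w, k) * comb(n - w, j - k) * eps ** flips * (1 - eps) ** (n - flips)
    return total


# First term of (5) for a weight-8 codeword: probability of reaching weight >= 12.
eps = 0.05
print(sum(weight_transition(8, j, eps) for j in range(12, 17)))
```

For w = 16 and w = 0 the sum collapses to a single binomial term, which matches the two special cases given for i = 0 and i = 1.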


Now we will evaluate the second term of (5). Let's return to the spheres
concept. The 16-element original Hadamard basis along with the "complements"
of these elements may be thought of as the centers of spheres which constitute a
32-partition of the $\{0,1\}^{16}$ space in which $y$ lies. If $i$ is the element we originally
have and if due to the noise corruption it is translated into a $y$ such that $d(i,y) > 4$,
$y$ will lie with the same probability in any of the remaining 32-2 = 30 regions of
the partition. Note that we exclude the complementary block which is unlikely
to occur. If $y$ lies in one of the 16-2 = 14 remaining regions whose center is one of the
alphabet elements we use, then we decode $y$ as being this element. If not we
decode it into "0" or "1" depending on its weight $w(y)$. Therefore we have:

$$\Pr\{y : d(y,j) > 4,\ j = 0,\ldots,15 \mid i\} = \Pr\{y : d(y,i) > 4 \mid i\} \times \Pr\{y : d(y,j) > 4,\ j = 0,\ldots,15 \mid d(y,i) > 4,\ i\}$$

$$= \frac{14}{30} \sum_{k=5}^{16} \binom{16}{k} \epsilon^{k} (1-\epsilon)^{16-k}.$$
Also

$$\Pr\{w(y) > 8 \mid 0\} = \sum_{j=9}^{16} \binom{16}{j} \epsilon^{16-j} (1-\epsilon)^{j}$$

$$\Pr\{w(y) = 8 \mid 0\} = \binom{16}{8} \epsilon^{8} (1-\epsilon)^{8} = \Pr\{w(y) = 8 \mid 1\}$$

$$\Pr\{w(y) > 8 \mid 1\} = \sum_{j=9}^{16} \binom{16}{j} \epsilon^{j} (1-\epsilon)^{16-j}$$

and by symmetry

$$\Pr\{w(y) > 8 \mid i\} + \tfrac{1}{2}\Pr\{w(y) = 8 \mid i\} = \tfrac{1}{2}, \qquad i = 2, \ldots, 15.$$

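By putting them all together now we get:

$$P(0 \mid 0) = \sum_{j=12}^{16} \binom{16}{j} (1-\epsilon)^{j} \epsilon^{16-j} + \frac{14}{30}\left[\frac{1}{2}\binom{16}{8}\epsilon^{8}(1-\epsilon)^{8} + \sum_{j=9}^{16}\binom{16}{j}(1-\epsilon)^{j}\epsilon^{16-j}\right]\sum_{d=5}^{16}\binom{16}{d}\epsilon^{d}(1-\epsilon)^{16-d} \qquad (6)$$

$$P(0 \mid 1) = \sum_{j=12}^{16} \binom{16}{j} \epsilon^{j}(1-\epsilon)^{16-j} + \frac{14}{30}\left[\frac{1}{2}\binom{16}{8}\epsilon^{8}(1-\epsilon)^{8} + \sum_{j=9}^{16}\binom{16}{j}\epsilon^{j}(1-\epsilon)^{16-j}\right]\sum_{d=5}^{16}\binom{16}{d}\epsilon^{d}(1-\epsilon)^{16-d} \qquad (7)$$

$$P(0 \mid i) = \sum_{j=12}^{16}\sum_{k=j-8}^{8} \binom{8}{k}\binom{8}{j-k}(1-\epsilon)^{8-j+2k}\epsilon^{8+j-2k} + \frac{14}{30}\cdot\frac{1}{2}\sum_{d=5}^{16}\binom{16}{d}\epsilon^{d}(1-\epsilon)^{16-d}, \qquad i = 2, 3, \ldots, 15 \qquad (8)$$

For j = 1:

Because of symmetry we have:

$$P(1 \mid 1) = P(0 \mid 0), \qquad P(1 \mid 0) = P(0 \mid 1), \qquad P(1 \mid i) = P(0 \mid i), \quad i = 2, 3, \ldots, 15.$$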
For j = 2, 3, ..., 15:

$$P(j \mid i) = \sum_{y : d(y,j) \le 3} P\{y \mid i\} + \sum_{\substack{y : d(y,j) = 4 \\ j = \arg\max_{l \in I(y)} \text{prior}(l)}} P\{y \mid i\} \le \sum_{y : d(y,j) \le 4} P\{y \mid i\} = \Pr\{y : d(y,j) \le 4 \mid i\},$$

where $I(y) = \{l : d(y,l) = 4\}$.

The  bound  above  is  adequate  for  our  needs  since  the  transition  probabilities 
are  treated  as  error  probabilities.  Therefore  from  now  on  we  will  approximate 
the  transition  probability  with  its  corresponding  bound.  We  expect  that  this 
approximation  will  lead  to  slightly  more  pessimistic  results  than  the  actual  ones 
we  would  obtain  via  simulations. 


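Let's now call $\bar{\imath}$ the "complementary element" of $i$. Then:

$$\Pr\{j,\ j \ne i, \bar{\imath} \mid i\} = \Pr\{y : d(y,j) \le 4 \mid i,\ d(y,i) > 4\} \times \Pr\{d(y,i) > 4\}$$

$$= \Pr\{y \text{ lies in one of 30 equally likely regions}\} \times \Pr\{d(y,i) > 4\} \Rightarrow$$

$$P(j \mid i) = \frac{1}{30}\sum_{k=5}^{16}\binom{16}{k}\epsilon^{k}(1-\epsilon)^{16-k}, \qquad i,j = 2, \ldots, 15,\ j \ne i, \bar{\imath} \qquad (9)$$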


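Finally we have:

$$\Pr(\bar{\imath} \mid i) = \Pr\{y : d(y,i) \ge 12\} = \sum_{k=12}^{16}\binom{16}{k}\epsilon^{k}(1-\epsilon)^{16-k}, \qquad i = 2, \ldots, 15 \qquad (10)$$

$$\Pr(i \mid i) = \sum_{k=0}^{4}\binom{16}{k}\epsilon^{k}(1-\epsilon)^{16-k}, \qquad i = 2, \ldots, 15 \qquad (11)$$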


A  summary  of  the  transition  probabilities  is  given  in  the  table  3.3. 
i            j                                   P(j|i) given by formula

0            0                                   (6)
1            1                                   (6)
0            1                                   (7)
1            0                                   (7)
2,...,15     0, 1                                (8)
2,...,15     2,...,15, $j \ne i, \bar{\imath}$   (9)
2,...,15     $i$                                 (11)
2,...,15     $\bar{\imath}$                      (10)

Table 3.3 Block transition probabilities summary


3.5  Summary 

In  this  chapter  we  first  saw  how  a  matrix  representing  a  multiple  gray  level 
pixel  image  can  be  converted  into  one  representing  a  block  coded  image,  so  that 
the  image  can  efficiently  be  stored  and  transmitted.  We  recalled  that  template 
matching  can  be  considered  as  a  quality  measure  for  the  reconstructed  image. 
Based  on  a  result  presented  in  the  next  chapter  we  showed  that  in  order  to  be 
able  to  implement  such  a  matching  test  acting  on  coded  image  data  we  need  to 
compress  the  image  more  than  the  storage  and  transmission  schemes  allow;  this 
effectively  leads  to  some  restricted  class  of  images  (binary  images  representing 
objects  which  can  be  described  by  the  modified  Hadamard  basis).  We  developed 
a coding scheme meeting our needs and calculated its statistical properties (block
priors and transitions) in terms of the noise characteristics.
Chapter 4

The Histogram Matching
Problem

4.1  The  Template  Histogram 

In the previous chapter we introduced a block coding scheme transforming a
binary image into a matrix of codewords in the range 0, ..., 15; we also computed
the transition probabilities among the blocks in terms of the probability of bit
inversion in the original image, which is the cause of these transitions.
In this way a 12 x 12-pixel image window is transformed into a 3 x 3-code sub-matrix,
since 4 x 4-pixel blocks were considered. The 9 elements of this sub-matrix
are independent random variables, because the BSC noise was supposed
to be white (in space), and their values define a configuration which will be called
the original state. If we now group together all the original states having the
same (1st order) histogram (i.e. all sub-matrices having the same number of
occurrences of each element) then we come up with a new set of states. We will call this new
set the histogram alphabet. Evidently the size of this set is much smaller than
that of the original set; the size of the histogram alphabet will be discussed
in the next chapter.

Let $M$ be the number of all possible distinct components (block codes) which
may constitute the template. Let also $n$ be the length of the template in blocks (in
the example above we had n = 9 = 3x3 blocks). We will denote by $S(M,n)$ the
size of the histogram alphabet. Call h the M-vector representing the histogram
of the noise-free coded template of size n; call also y the M-vector representing
the histogram of a noise corrupted and afterwards coded image window of size
n. The vector h represents some feature we want to detect on the image; our
problem amounts to comparing the two histograms h and y and inferring a decision
about their matching.

We will formulate this matching as an M-ary hypothesis testing problem which
will efficiently lead to an optimal binary decision rule. To do so we will assume we
are given the prior probabilities and the transition probabilities of the histogram
elements (i.e. the block priors $p_i$, $i = 0, 1, \ldots, M-1$ and block transitions
$\epsilon_{ij}$, $i,j = 0, 1, \ldots, M-1$) so that we will be able to find the histogram priors
and histogram transitions.

Observe  that  if  we  set  M  =  2,  i.e.  if  we  have  a  binary  code,  then  the 
histogram  reduces  to  the  “sum-of-pixels”  statistic  we  discussed  in  chapter  2.  In  the 
subsection  4.2  we  develop  a  rule  which  is  a  generalization  of  the  one  developed 
in  chapter  2.  This  rule  can  be  applied  to  compare  arbitrary  histograms,  provided 
we  have  the  information  about  the  histogram  elements  mentioned  above. 

4.2  The  Histogram  Matching  as  an  M-ary  Hypothesis  Test 

4.2.1  The  Histogram  Alphabet 

Let $\{\alpha_0, \alpha_1, \ldots, \alpha_{M-1}\}$ be the block alphabet. Evidently we will have $M^n$
possible n-tuples of blocks. We introduce an equivalence relation onto this set of
n-tuples. An equivalence class will be composed of all n-tuples having the same
histograms, i.e. two n-tuples are in the same class if one is a permutation of the
elements of the other. A representative element of an equivalence class will be
called a histogram pattern.

Consider now the following power expansion [Spi68]:

$$(\alpha_0 + \alpha_1 + \cdots + \alpha_{M-1})^n = \sum_{n_0 + n_1 + \cdots + n_{M-1} = n} \frac{n!}{n_0!\, n_1! \cdots n_{M-1}!}\, \alpha_0^{n_0} \alpha_1^{n_1} \cdots \alpha_{M-1}^{n_{M-1}}$$

Observe that:

1. Each term in the summation above can be considered as a histogram pattern,
i.e.

$$\alpha_0^{n_0} \alpha_1^{n_1} \cdots \alpha_{M-1}^{n_{M-1}}$$

represents an n-tuple in which we have $n_0$ times the element $\alpha_0$, $n_1$ times
the element $\alpha_1$ and so on; note that if $n_i = 0$ for some $i$, then $\alpha_i$ is not in
the n-tuple.

2. The size of each equivalence class will be equal to

$$\frac{n!}{n_0!\, n_1! \cdots n_{M-1}!}$$

3. If $p_i$ is the prior probability for $\alpha_i$, $i = 0, 1, \ldots, M-1$, then the independence
assumption implies that the prior probability for a specific element in the
equivalence class will be

$$p_0^{n_0} p_1^{n_1} \cdots p_{M-1}^{n_{M-1}}$$

while the prior for the class itself, i.e. the prior of the histogram pattern will
be

$$\frac{n!}{n_0!\, n_1! \cdots n_{M-1}!}\; p_0^{n_0} p_1^{n_1} \cdots p_{M-1}^{n_{M-1}}.$$
4. We can systematically construct the histogram patterns with the following
procedure:

a. Find all integer partitions of the number n in at most M places (see
subsection A.2)

$$\{n_0, n_1, \ldots, n_{M-1}\}$$

b. For each such partition find all the distinct permutations of the $n_i$'s in M
places. Note that we have

$$\binom{M}{\nu_0}\binom{M - \nu_0}{\nu_1}\cdots$$

of them, where $\nu_0, \nu_1, \ldots$ count how many places carry each distinct nonzero value
of the partition; e.g. for the partition 5+2+2 and M = 16 we get $\binom{16}{1}\binom{15}{2} = 1680$.

Each permutation will determine a set of exponents in the expansion formula
and therefore a histogram pattern.

5. The number of histogram patterns, i.e. the size of the histogram alphabet,
S(M,n) will be equal to the number of summands and therefore

$$S(M,n) = \binom{M+n-1}{n}.$$

Note that in the case of the binary block alphabet (e.g. black/white) we get
S(2,n) = n+1 = the number of possible variations of black-white mixtures in
a pattern of size n.

In table 4.1 we see the integer partitions of the number n = 9, the number
of classes generated and the size of each class (number of original states grouped
together) for the special case of M = 16.

The equivalence classes described (histogram patterns) will determine the
states for the M-ary test we will develop; the state space will be denoted by H.


integer partition    # of classes generated    size of class

9                    16                        1
81                   240                       9
72                   240                       36
711                  1680                      72
63                   240                       84
621                  3360                      252
6111                 7280                      504
54                   240                       126
531                  3360                      504
522                  1680                      756
5211                 21840                     1512
51111                21840                     3024
441                  1680                      630
432                  3360                      1260
4311                 21840                     2520
4221                 21840                     3780
42111                87360                     7560
411111               48048                     15120
333                  560                       1680
3321                 21840                     5040
33111                43680                     10080
3222                 7280                      7560
32211                131040                    15120
321111               240240                    30240
3111111              80080                     60480
22221                21840                     22680
222111               160160                    45360
2211111              240240                    90720
21111111             102960                    181440
111111111            11440                     362880

Table 4.1 The histogram pattern classes
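
The entries of table 4.1 can be recomputed from the construction procedure described above. The sketch below is illustrative only: the function names are ours, and the partition generator is a plain recursive enumeration rather than the procedure of subsection A.2, which is not reproduced here. It enumerates the integer partitions of n, counts the distinct placements of each partition in M places, and computes the class size n!/(n_0! n_1! ...).

```python
from collections import Counter
from math import comb, factorial


def partitions(n, max_part=None):
    """Yield the integer partitions of n as non-increasing tuples."""
    if max_part is None:
        max_part = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, max_part), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest


def classes_generated(partition, M):
    """Distinct ways to place the parts in M places (unused places hold 0)."""
    if len(partition) > M:
        return 0
    count, free = 1, M
    for multiplicity in Counter(partition).values():
        count *= comb(free, multiplicity)  # choose places for equal parts together
        free -= multiplicity
    return count


def class_size(partition):
    """Number of n-tuples sharing one histogram: n! / (n_0! n_1! ...)."""
    size = factorial(sum(partition))
    for part in partition:
        size //= factorial(part)
    return size


if __name__ == "__main__":
    M, n = 16, 9
    for p in partitions(n):
        print("".join(map(str, p)), classes_generated(p, M), class_size(p))
```

Two consistency checks follow from the text: the classes generated by all partitions add up to S(M,n), and classes weighted by their sizes add up to the M^n original states.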


4.2.2  The  Observation  and  the  Cost  Function 

The observation data in our test correspond to some n-vector of block codewords.
This information can equivalently be represented by an M-vector y being
the histogram pattern of the image window we scan. For example if n = 4 and M
= 16 we may have the observation [1 5 15 1] which is equivalent to the histogram
pattern y = [0 2 0 0 0 1 0 0 0 0 0 0 0 0 0 1].
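
As a small illustration of this mapping, the following sketch converts an n-vector of block codewords into its M-vector histogram pattern. The function name `histogram_pattern` is ours.

```python
def histogram_pattern(observation, M):
    """Count how many times each of the M block codes occurs in the observation."""
    counts = [0] * M
    for code in observation:
        counts[code] += 1
    return counts


# The example from the text: n = 4 codewords over an alphabet of M = 16 blocks.
print(histogram_pattern([1, 5, 15, 1], 16))
```

Any reordering of the observation gives the same histogram, which is exactly the equivalence relation that defines the histogram alphabet.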

We introduce a partition $H_0$, $H_1$ of the state space $H$. Given an observation
y we want to decide "matching" ($H_1$) or "not matching" ($H_0$). We will do so
by imposing the cost function:

$$\text{cost} = \begin{cases} 0, & \text{if both } \mathbf{h}, \mathbf{y} \text{ belong to either } H_0 \text{ or } H_1 \\ 1, & \text{if } \mathbf{h}, \mathbf{y} \text{ do not belong to the same set } H_q,\ q = 0, 1 \end{cases}$$

Note that $H_1$ is not necessarily a singleton. This means we may consider matching
with multiple histogram patterns simultaneously.

4.2.3  The  Transitions  Among  the  Histogram  Patterns 

A  block  alphabet  of  size  M  =  3  will  first  be  considered  and  afterwards  we 
will  generalize  our  result. 

Let's summarize our notation and introduce some new one. We have:

h : Histogram pattern, $\mathbf{h} \in H$; e.g. for n = 4: the template [1 0 2 2] gives
histogram h = [1 1 2]

$h_i$, i = 0,1,2 : Number of occurrences of the block i in the pattern, e.g.
$h_0 = 1$, $h_1 = 1$, $h_2 = 2$

y : Observation pattern: it has the same format as h and is subject to
comparison with it.

$y_i$, i = 0,1,2 : Number of occurrences of the block i in the observation vector.

$\epsilon_{ij}$ : Block transition probability $\Pr\{i \to j\}$.

$k_{ij}$ : Number of (original) blocks i in h "transformed" (as a result of noise
corruption) into blocks j in the observation pattern y.
We want to evaluate the histogram transition probabilities:

$$P_{\mathbf{h}}(\mathbf{y}) = P_{[h_0, h_1, h_2]}([y_0, y_1, y_2])$$

= Pr{ y is composed by $y_0$ elements (blocks) of type 0,
$y_1$ elements of type 1,
$y_2$ elements of type 2 /
h is composed by $h_0$ elements (blocks) of type 0,
$h_1$ elements of type 1,
$h_2$ elements of type 2 }
x the size of the histogram class of y

All pairs of (h, y) can be represented in the form shown in figure 4.1.

Note that:

• $k_{i2} = h_i - k_{i0} - k_{i1}$, i = 0,1,2

• $k_{2i} = y_i - k_{0i} - k_{1i}$, i = 0,1,2

So the free parameters are: $k_{00}, k_{01}, k_{10}, k_{11}$.

Figure 4.1 Transition probabilities for the case of ternary (M=3) image


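Compute the probabilities $P_a$, $P_b$, $P_c$: The number of combinations of
$k_{00}$, $k_{01}$, $h_0 - k_{00} - k_{01}$ elements of type 0, 1, 2 respectively in $h_0$ places is:

$$\binom{h_0}{k_{00}}\binom{h_0 - k_{00}}{k_{01}}$$

For a specific configuration of the $h_0$ first elements of the y-pattern we have the
transition probability

$$\epsilon_{00}^{k_{00}}\, \epsilon_{01}^{k_{01}}\, \epsilon_{02}^{h_0 - k_{00} - k_{01}}$$

Therefore we get

$$P_a(k_{00}, k_{01}) = \binom{h_0}{k_{00}}\binom{h_0 - k_{00}}{k_{01}} \epsilon_{00}^{k_{00}}\, \epsilon_{01}^{k_{01}}\, \epsilon_{02}^{h_0 - k_{00} - k_{01}}$$

Similarly

$$P_b(k_{10}, k_{11}) = \binom{h_1}{k_{10}}\binom{h_1 - k_{10}}{k_{11}} \epsilon_{10}^{k_{10}}\, \epsilon_{11}^{k_{11}}\, \epsilon_{12}^{h_1 - k_{10} - k_{11}}$$

$$P_c(k_{00}, k_{01}, k_{10}, k_{11}) = \binom{h_2}{k_{20}}\binom{h_2 - k_{20}}{k_{21}} \epsilon_{20}^{k_{20}}\, \epsilon_{21}^{k_{21}}\, \epsilon_{22}^{h_2 - k_{20} - k_{21}}$$

where $k_{20} = y_0 - k_{00} - k_{10}$, $k_{21} = y_1 - k_{01} - k_{11}$.

We can now compute the probability

$$P_{\mathbf{h}}(\mathbf{y}) = \frac{n!}{y_0!\, y_1!\, y_2!} \sum_{\substack{\text{all acceptable combinations}\\ \text{of } k_{00}, k_{01}, k_{10}, k_{11}}} P_a(\cdot)\, P_b(\cdot)\, P_c(\cdot)$$

provided we have the ranges for the elements in the 4-tuple of k's.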


From figure 4.1 we get the following constraints:

$$0 \le k_{00} + k_{10} \le y_0 \qquad (1)$$

$$0 \le k_{01} + k_{11} \le y_1 \qquad (2)$$

$$0 \le (h_0 - k_{00} - k_{01}) + (h_1 - k_{10} - k_{11}) \le y_2 \qquad (3)$$

$$0 \le k_{00} + k_{01} \le h_0 \qquad (4)$$

$$0 \le k_{10} + k_{11} \le h_1 \qquad (5)$$

$$0 \le (y_0 - k_{00} - k_{10}) + (y_1 - k_{01} - k_{11}) \le h_2 \qquad (6)$$

The inequalities (1), (2), (4) and (5) are the conditions for $k_{20}$, $k_{21}$, $k_{02}$, $k_{12}$
respectively to be non-negative numbers. The inequalities (3) and (6) are necessary
for $k_{22}$ to be a non-negative number but they are not sufficient.

We also have to impose the following reasonable constraint:

$$k_{ij} \ge 0, \qquad i,j = 0, 1. \qquad (7)$$

We have:

$$(2),(6) \Rightarrow 0 \le y_0 - k_{00} - k_{10} \le h_2 \Rightarrow k_{00} + k_{10} \ge y_0 - h_2 \qquad (8)$$

$$(1),(8) \Rightarrow k_{c0}^{\min} = \max\{y_0 - h_2, 0\} \le k_{00} + k_{10} \le y_0 \qquad (9)$$

Similarly we also get:

$$k_{c1}^{\min} = \max\{y_1 - h_2, 0\} \le k_{01} + k_{11} \le y_1 \qquad (10)$$

$$k_{r0}^{\min} = \max\{h_0 - y_2, 0\} \le k_{00} + k_{01} \le h_0 \qquad (11)$$

$$k_{r1}^{\min} = \max\{h_1 - y_2, 0\} \le k_{10} + k_{11} \le h_1 \qquad (12)$$
The inequalities (7) and (9)-(12) constitute a set of necessary conditions for
$k_{ij} \ge 0$, i,j = 0,1,2; they become sufficient if the induced value for $k_{22}$ is
checked to be a non-negative integer. In figure 4.2 the inequalities are pictorially
summarized. The summations of the row elements as well as the summations of
the column elements have to lie in the ranges provided; the minimum values are
given at the top and left, while the maximum values are at the bottom and right.
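
The enumeration of acceptable k's can be illustrated with a short sketch. It is not the procedure of subsection A.3; instead of the explicit range bounds it simply generates every matrix of non-negative integers $k_{ij}$ whose row sums are the $h_i$ and whose column sums are the $y_j$, which is the same acceptable set. It then adds up the products of the per-row terms (the terms called $P_a$, $P_b$, $P_c$ for M = 3). Only this sum is evaluated; no class-size prefactor is applied. All function names are ours.

```python
from math import factorial, prod


def compositions(total, caps):
    """Yield tuples k with sum(k) == total and 0 <= k[j] <= caps[j]."""
    if len(caps) == 1:
        if total <= caps[0]:
            yield (total,)
        return
    for first in range(min(total, caps[0]) + 1):
        for rest in compositions(total - first, caps[1:]):
            yield (first,) + rest


def row_term(k_row, eps_row):
    """Multinomial term for one row: blocks of one type spread over the output types."""
    coef = factorial(sum(k_row))
    for k in k_row:
        coef //= factorial(k)
    return coef * prod(e ** k for e, k in zip(eps_row, k_row))


def sum_over_acceptable_k(h, y, eps):
    """Sum over all acceptable k matrices of the product of the per-row terms."""
    if sum(h) != sum(y):
        return 0.0

    def rec(i, remaining):
        # 'remaining' holds the column capacity still to be filled by rows i, i+1, ...
        if i == len(h):
            return 1.0 if all(r == 0 for r in remaining) else 0.0
        total = 0.0
        for k_row in compositions(h[i], remaining):
            left = tuple(r - k for r, k in zip(remaining, k_row))
            total += row_term(k_row, eps[i]) * rec(i + 1, left)
        return total

    return rec(0, tuple(y))
```

For example, with `h = (1, 1, 2)` and a 3x3 matrix `eps` whose rows sum to one, adding `sum_over_acceptable_k(h, y, eps)` over every histogram `y` of four blocks gives 1, so the sum of products is a probability distribution over the observed histograms.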


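[Figure 4.2: the 2x2 array of free parameters $k_{00}$, $k_{01}$, $k_{10}$, $k_{11}$, with the row-sum bounds $k_{r0}^{\min}$, $k_{r1}^{\min}$ and the column-sum bounds $k_{c0}^{\min}$, $k_{c1}^{\min}$ marked along its sides.]

Figure 4.2 The parameter (k's) constraints for the case of the ternary (M=3) image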

Now we can generalize the result produced for the special case of alphabet
size M = 3 to hold for a block alphabet of arbitrary size. The problem amounts
to finding the transition probabilities

$$P_{\mathbf{h}}(\mathbf{y}) = \Pr\{\mathbf{h} = [h_1\ h_2 \cdots h_{n_h}] \to \mathbf{y} = [y_1\ y_2 \cdots y_{n_y}]\},$$

where $h_i$ (or $y_i$) are the numbers of occurrences of a symbol in the vector h
(or y) and $n_h$, $n_y$ ($n_h, n_y \le M$) are the numbers of non zero bars in the h and
y histograms.
First we will find the probabilities $P_i(\cdot)$, $i = 1, 2, \ldots, n_h$, which correspond
to the probabilities $P_a(\cdot)$, $P_b(\cdot)$, $P_c(\cdot)$ for the case of M = 3. As a direct
extension of the special case we get:

$$P_i(k_{i1}, \ldots, k_{i,n_y-1}) = \binom{h_i}{k_{i1}}\binom{h_i - k_{i1}}{k_{i2}}\cdots\binom{h_i - \sum_{j=1}^{n_y-2} k_{ij}}{k_{i,n_y-1}} \times \epsilon_{i1}^{k_{i1}}\, \epsilon_{i2}^{k_{i2}} \cdots \epsilon_{i n_y}^{k_{i n_y}}, \qquad k_{i n_y} = h_i - \sum_{j=1}^{n_y-1} k_{ij}$$

For $i = n_h$, i.e. for $P_{n_h}(\cdot)$, the above formula holds but we also have

$$k_{n_h j} = y_j - \sum_{i=1}^{n_h-1} k_{ij}, \qquad j = 1, 2, \ldots, n_y$$

and therefore $P_{n_h}(\cdot)$ depends on all $k_{ij}$, $i = 1, 2, \ldots, n_h - 1$, $j = 1, 2, \ldots, n_y - 1$.

The probability $P_{\mathbf{h}}(\mathbf{y})$ will be of the form:

$$P_{\mathbf{h}}(\mathbf{y}) = \frac{n!}{y_1!\, y_2! \cdots y_{n_y}!} \sum_{\substack{\text{all acceptable}\\ k_{ij} \text{ tuples}}}\ \prod_{i=1}^{n_h} P_i(\cdot)$$

The necessary conditions to be satisfied by the k's are summarized in figure
4.3. An additional condition providing sufficiency is that the induced $k_{n_h n_y}$
element be a non-negative integer.6


6 In subsection A.3 a systematic way is described for producing the acceptable tuples of
k's.
Figure 4.3 The parameter (k's) constraints for the
case of multiple valued image cells (blocks or pixels)


4.2.4  The  Decision  Rule 

Given the histogram alphabet, the prior probabilities of the histogram patterns
(states), the transitions among them (conditional probabilities) and the cost function
performing the desirable grouping of the states, we may find a decision rule
which minimizes the mean cost of the "matching"/"non matching" decision. The
rule d(.) as an application of the M-ary hypothesis testing is as follows:

Let y be the observation pattern, $p_q$, $q \in H$ be the prior probabilities for
the histogram patterns and $P_{\mathbf{h}}(\mathbf{y})$, $\mathbf{h}, \mathbf{y} \in H$ be the transition probabilities.
Then the rule is

$$d(\mathbf{y}) = i^* \ \text{ such that } \ g^{i^*}(\mathbf{y}) = \max_{i \in \{0,1\}} g^{i}(\mathbf{y}), \qquad g^{i}(\mathbf{y}) = \begin{cases} \sum_{q \in H_1} p_q P_q(\mathbf{y}), & i = 1 \\ \sum_{q \in H_0} p_q P_q(\mathbf{y}), & i = 0 \end{cases}$$

We can numerically determine a partition $\{B_0, B_1\}$ of H (see appendix) such that

$$d(\mathbf{y}) = i \iff \mathbf{y} \in B_i, \qquad i = 0, 1.$$

We recall that

$$d(\mathbf{y}) = \begin{cases} 1 & \text{if we decide matching} \\ 0 & \text{if we decide non matching} \end{cases}$$

Note also that the calculation of the partition $\{B_0, B_1\}$ corresponds to the calculation
of a threshold value in the case of the binary block alphabet as we have
seen in chapter 2.
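
A minimal sketch of such a rule is given below. It assumes the histogram priors and transitions are already available as dictionaries, compares the prior-weighted likelihood of the observation accumulated over the matching states with that accumulated over the non-matching states, and resolves ties in favour of "non matching" (a choice of ours). The names `decide`, `priors`, `trans` and `H1` and the three-state toy data are hypothetical.

```python
def decide(y, priors, trans, H1):
    """Return 1 ("matching") or 0 ("non matching") for the observed pattern y.

    priors: dict state -> prior probability
    trans:  dict state -> dict observed pattern -> transition probability
    H1:     set of states regarded as "matching"
    """
    g = [0.0, 0.0]
    for q, p in priors.items():
        g[1 if q in H1 else 0] += p * trans[q][y]
    return 1 if g[1] > g[0] else 0


# Toy example with three histogram patterns; "c" is the only matching state.
priors = {"a": 0.5, "b": 0.3, "c": 0.2}
trans = {
    "a": {"a": 0.8, "b": 0.1, "c": 0.1},
    "b": {"a": 0.1, "b": 0.8, "c": 0.1},
    "c": {"a": 0.1, "b": 0.1, "c": 0.8},
}
print([decide(y, priors, trans, {"c"}) for y in ("a", "b", "c")])
```

Evaluating `decide` for every pattern in the state space yields the partition into the two decision regions.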


4.2.5  Performance  Evaluation  of  the  Decision  Rule 

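We will calculate the $(P_{fa}, P_d)$ pair for our rule.

Probability of false alarm:

$$P_{fa} = \Pr\{d \in H_1 \mid \mathbf{h} \in H_0\} = \sum_{i \in H_1} \Pr\{d = i \mid \mathbf{h} \in H_0\}$$

$$\Pr\{d = i \mid \mathbf{h} \in H_0\} = \sum_{j \in H_0} \Pr\{d = i \mid \mathbf{h} = j\} \Pr\{\mathbf{h} = j \mid \mathbf{h} \in H_0\}$$

$$P_{fa} = \sum_{i \in H_1} \sum_{j \in H_0} \Pr\{d = i \mid \mathbf{h} = j\} \Pr\{\mathbf{h} = j \mid \mathbf{h} \in H_0\} \Rightarrow$$

$$P_{fa} = \sum_{j \in H_0} \frac{\Pr\{\mathbf{h} = j\}}{\Pr\{\mathbf{h} \in H_0\}} \sum_{i \in H_1} \Pr\{d = i \mid \mathbf{h} = j\}$$

where $\Pr\{d = i \mid \mathbf{h} = j\} = \sum_{\mathbf{y} \in B_i} P_j(\mathbf{y})$
for some decision region $B_i$. So

$$P_{fa} = \frac{1}{\sum_{l \in H_0} p_l} \sum_{j \in H_0} p_j \sum_{i \in H_1} \sum_{\mathbf{y} \in B_i} P_j(\mathbf{y}) = \frac{1}{\sum_{l \in H_0} p_l} \sum_{j \in H_0} p_j \sum_{\mathbf{y} \in B_1} P_j(\mathbf{y}).$$

Similarly the probability of miss $P_m$ is found to be

$$P_m = \frac{1}{\sum_{l \in H_1} p_l} \sum_{i \in H_1} p_i \sum_{\mathbf{y} \in B_0} P_i(\mathbf{y})$$

and as usual the probability of detection $P_d$ is $P_d = 1 - P_m$.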
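
The two performance figures can be evaluated numerically once the decision regions are known. The sketch below is an illustration under assumptions: the regions are obtained by comparing the prior-weighted likelihoods of the two groups of states (ties go to "non matching"), the priors and transitions are passed as dictionaries, and the three-state data are a hypothetical toy example. The function names are ours.

```python
def decision_regions(states, priors, trans, H1):
    """Split the observation patterns into (B0, B1)."""
    B0, B1 = set(), set()
    for y in states:
        g1 = sum(priors[q] * trans[q][y] for q in states if q in H1)
        g0 = sum(priors[q] * trans[q][y] for q in states if q not in H1)
        (B1 if g1 > g0 else B0).add(y)
    return B0, B1


def performance(states, priors, trans, H1):
    """Return (P_fa, P_d): false alarm and detection probabilities of the rule."""
    B0, B1 = decision_regions(states, priors, trans, H1)
    H0 = [q for q in states if q not in H1]
    # False alarm: a non-matching state is observed inside the matching region B1.
    p_fa = sum(priors[j] * sum(trans[j][y] for y in B1) for j in H0)
    p_fa /= sum(priors[j] for j in H0)
    # Miss: a matching state is observed inside the non-matching region B0.
    p_m = sum(priors[i] * sum(trans[i][y] for y in B0) for i in H1)
    p_m /= sum(priors[i] for i in H1)
    return p_fa, 1.0 - p_m


states = ["a", "b", "c"]
priors = {"a": 0.5, "b": 0.3, "c": 0.2}
trans = {
    "a": {"a": 0.8, "b": 0.1, "c": 0.1},
    "b": {"a": 0.1, "b": 0.8, "c": 0.1},
    "c": {"a": 0.1, "b": 0.1, "c": 0.8},
}
print(performance(states, priors, trans, {"c"}))
```

Repeating the computation for several values of the inversion probability, with the transitions recomputed each time, gives the points of a curve of the kind discussed next.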


In  the  plot  9  the  Receiver  Operating  Characteristic  (ROC)  curve  is  given  for 
various  values  of  the  inversion  probability  e  in  the  original  image  data.  The  curves 
are  drawn  for  the  2-block  template  shown  in  figure  5.1.  The  subset  of  the  first 
four  elements  of  the  modified  Hadamard  basis  was  used  as  the  block  codebook. 


We observe that these ROC curves are much closer to the point (Pfa=0, Pd=1)
than the ROC curves we have already seen (plots 2 and 4); this is surprising since
in those tests we assumed we knew the background of the target object while here
we do not. Certainly the coding procedure we used may have contributed to the
improvement of the decision rule but still this does not seem to be a satisfactory
explanation. We can find an answer to this question if we note that here the
detection region H1 is a singleton which contains the target object itself, while in
the previous tests it contained the target object (the black square) as well as
all the patterns representing part of it in white background. For example the four
patterns containing only one black pixel of a corner of the black square belong
to H1 and contribute to the probability of miss in case they are not detected. On
the other hand the fact that such patterns belong to H1 implies a low decision
threshold y0, which forces Pfa to increase.

In plot 10 the Pfa-Pd curve (for the optimal Bayesian rules) parametrized
by the pixel inversion probability in the original image is given. The modified
Hadamard basis is used for the coding of the image data; the curve is drawn
for the 2 x 2-block full black template. Further discussion on the comparison of
decision rules is made in subsections 5.1 and 5.2.


4.3  Summary 

In this chapter we introduced the notion of the histogram and discussed
the histogram alteration as a result of the alteration of bar elements. We also
developed a decision rule for inference about the matching of such histograms.
Our rule actually rests on an M-ary hypothesis test and is optimal in the sense
of minimizing the probability of taking an erroneous decision. The performance
evaluation characteristics of this rule can be computed numerically. Sample results
are given.

In  the  next  chapter  we  will  comment  on  the  discrimination  power  of  the 
histogram  matching  test  and  compare  it  with  the  template  matching  test  in  terms 
of  complexity  and  reliability. 
Chapter 5
Discussion

5.1 Review : Detection in Background

Several instances of the two dimensional matching problem have been studied.
Our intention was to show the dependence of the detection capability on the noise
that the image suffers and the effect of quantization on the image data.
Throughout all the preceding analysis we primarily concentrated on the 8 x 8-pixel
black template as the prototype geometrical object to be recognized. In chapter
2 we first considered a binary uncoded image corrupted by AWG noise. In that
case we knew we had white background and therefore took advantage of the
a priori known possible positions of the image window scanning the image in
relation to the target black square. This background knowledge gave rise to an
M-ary test, for which the distinct states represented the possible image window
positionings (see fig. 2.1) i.e.

1. completely white : view the background
2. completely black : view the target square
3. partially black : view the neighborhood of the target square pattern.

Afterwards we grouped together the 2- and 3-type states, forming thus the binary
test which we eventually treated in a Bayesian framework. We gave an insight on
how the noise affects the decision making (see figure 2.2 and plot 1). The ROC
of the test was also given for noise variance equal to σ²=0.1 (plot 2).


A similar formulation was presented for the case of the detection in background
with BSC noise. The noise effect was studied (plot 3) and the ROC curve
for pixel inversion probability ε=0.1 was given (plot 4). We will use the notation
(2,64)-KB to characterize the test for matching a 64-element pattern of 2-valued
elements (bits) in known background. In both cases above the sum-of-pixels
statistic (equivalent to the Hamming weight for the BSC noise case) was used.
This essentially means that we do not take into account the positions of the pixels;
we simply count the white and the black ones and use this information for our
test. Evidently this affects the reliability of the test when the position information
for the template carries important additional information; this will be discussed
in the next subsection. Note though that position information is assumed to be of
no interest for the case of the full black template.

A simple way of compacting the image data, the coarsening of resolution, is
presented at the end of chapter 1. Based on the coded data a rule for detection in
background is developed for which the 8 x 8-pixel template of the previous two
cases is replaced by a 4 x 4-block template of 2 x 2-pixel blocks. This detection
rule turns out to be more robust to the noise than the rule based on uncoded
data, provided we do not have much noise (compare the plots 3 and 5 for pixel
inversion probability ε<0.05). The result is reversed for large values of ε, since the
Pfa(ε)-Pd(ε) curve for the uncoded case is higher and closer to the Pd axis. This
observation is further validated by the ROC curves for ε=0.1 (plot 11 : (2,16)-KB
test better) and ε=0.3 (plot 12 : (2,64)-KB test better). We will characterize
this test as a (2,16)-KB test since it matches a 16-element pattern of 2-valued
elements.
In chapter 4 we developed another test acting on coded image data. This time
we used a modification of the 4 x 4 Hadamard basis to code the image data (see
fig. 3.1). The detection of the black square in white background eventually is
equivalent to the previous scheme, since we have zero prior probabilities for all
but the first two elements of the basis; though here the original 8 x 8-pixel template
is treated as a 2 x 2-block template of 4 x 4-pixel blocks, i.e. we have an even
coarser quantization. The ROC for this test, which is treated as a (2,4)-KB test,
is rather worse than the one for the (2,16)-KB test, while the optimal Bayesian
rules (plot 7) give considerably lower Pd values than the previous tests even for
ε=0.1 (see plot 8).

In plots 13 and 14 a comparison of these three tests acting on data with
different resolutions is attempted. It is apparent that at a certain noise level the
compression of data favors detection capability, while for higher noise levels
better detection results are obtained by a rule based on the uncoded data. More
specifically in plot 14 the (2,16)-KB test has higher Pd at three distinct intervals
of ε (around the points ε=0.02, ε=0.14 and ε=0.22). Also the (2,4)-KB test is
superior to the other two tests for heavily noise corrupted data (ε>0.15). In plot
13 however it becomes evident that these superiorities of the coarsened resolution
data always are penalized with peaks of the ε-Pfa curve. Nevertheless, note that
for ε<0.05 the (2,16)-KB test performs better than the (2,64)-KB test in terms of
both Pfa and Pd.
5.2 Review : Detection Without Known Background

When we are looking for an object in an image we usually do not have the
luxury of knowing what we expect to find around it, and we do not look
only for an object as uniform as the full black template. These facts led us
to the modification of the Hadamard basis (in chapter 3) and to the general type
(M,n)-UB test acting on n-element patterns of M-valued elements (in chapter 4)
in unknown background. This type of test relies on the histogram statistic, which
is the generalization of the sum-of-pixels statistic for the case of the M-valued
elementary image data.

In this subsection we will see the results of two kinds of experiments. In the
first one the image elements in the (M,n)-UB test are the codewords of the modified
Hadamard basis or some subset of it. So we will continue the discussion on
the noise effects as opposed to the quantization effects on the detection capability
in the more general and practical case where we have more than two distinct
codewords. The resulting tests will usually have M >> n. In the second kind of
experiments we will have multiple valued pixels which may be thought of as multiple
gray level pixels or multiple color pixels. The experiments are relevant to the
classification of patterns according to their color. The resulting tests usually
have M << n. In both types of tests we assume we have no knowledge of the
background. This means that :

1. When scanning the image we cannot take advantage of the windows viewing
the target partially; note that this was the reason that induced the M-ary
hypothesis test for the full black template case in the (2,n)-KB type test.
2. We need to know the prior probabilities of the image elements we use. For the
case of the image elements which are codewords we derived these probabilities
based on statistical results given in [LaSl71]. Therefore the results of this
analysis apply for searching in natural (not synthetic) images as well. For the
case of multiple valued pixels we arbitrarily assume uniform distribution.

Remember that in the (M,n)-UB test we have to match an n-element pattern of
M-valued elements. We assume that only the number of occurrences of each
distinct element value and not the specific positions is important. Consequently
we have to match a histogram of M bins whose bars sum up to n. Each such distinct
histogram constitutes a hypothesis of the M-ary test. Recall that we will have

\[ S(M,n) = \binom{M+n-1}{n} \]

such hypotheses. We may define one or more of them as
target histograms and group the hypotheses into two parts. The resulting binary
test distinguishing these two sets is what we call the (M,n)-UB type test.
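
The hypothesis set of an (M,n)-UB test can be enumerated directly for small cases. The sketch below is illustrative only (the function names are hypothetical): it lists every M-bin histogram whose bars sum to n and compares the count with the standard stars-and-bars formula, which reproduces the sizes 65 and 3876 quoted in this chapter.

```python
from math import comb


def histograms(M, n):
    """Yield every M-bin histogram (tuple of bar heights) whose bars sum to n."""
    if M == 1:
        yield (n,)
        return
    for first in range(n + 1):
        # Fix the first bar, then distribute the remaining mass over M-1 bins.
        for rest in histograms(M - 1, n - first):
            yield (first,) + rest


def state_space_size(M, n):
    """Number of distinct histograms: n indistinguishable items in M bins."""
    return comb(M + n - 1, n)
```

For example, `len(list(histograms(2, 64)))` and `state_space_size(2, 64)` both give 65, the number of states of the (2,64) test, and `state_space_size(16, 4)` gives 3876.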


Figure  5.1  A  sample  template 


Consider the (2,4)-KB test we mentioned in the previous subsection with the
difference that now we have no knowledge about the background. In plot 10
the noise effect is shown for the corresponding (16,4)-UB test. Note how the
performance of the test worsens as the pixel inversion probability ε in the original
image ranges from 0.05 to 0.35. For ε=0.35 the optimal Bayesian rule is “never
decide square”.7

Consider the template of figure 5.1. A (4,2)-UB test for this template has
been studied; the ROC curve for a number of different values of ε is given (see plot
9). The first four elements in the modified Hadamard basis with normalized prior
probabilities are used. Note that although the blocks constituting the template are
selected to be the ones least a priori favored, the test is quite reliable (the ROC is
close to the Pd axis) even for ε=0.4. This was discussed in subsection 4.2.5.

5.3  The  Code-Corrupt-Detect  System  and  the 
Multiple  Gray  Level  Images 

In  the  model  we  studied  so  far  the  noise  process  affects  the  uncoded  image 
and  the  noise  effect  propagates  in  the  coded  version  of  the  image.  In  this  way  we 
can  exploit  the  effect  of  the  noise  generated  at  the  moment  we  receive  the  image 
and  thus  this  model  may  be  called  “the  corrupt-code-detect  system”. 

A  related  problem  is  that  of  introducing  the  noise  after  the  image  has  been 
coded;  this  is  the  case  of  the  degradation  of  an  image  during  the  transmission 
of  the  coded  version  of  it  over  a  noisy  channel.  For  the  case  of  the  modified 
Hadamard  basis  we  use  the  source  coding  shown  in  table  5.1. 

This specific code is selected because it retains the low transition probabilities
between the elements 1 and 2, 3 and 4, etc. in the coded version of the image.

7  For  this  test  the  ROC  cannot  be  computed  due  to  the  memory  requirements  for  such  a 
computation.  Notice  that  the  decision  region  which  is  recursively  computed  must  range  from  0 
to  S(16,4)=3876. 
In the plots 15-23 results similar to the ones found for the code-corrupt-detect
case are illustrated. It is worthwhile to note that

• The optimal Bayesian rules based on the pixel data always perform better
(have higher probability of detection) than the rules based on the coded data.
• However there is a small range for ε (close to ε=0.06) where the (2,4)-KB
test with compression rate 1/16 bpp performs better than both the (2,16)-KB
and (16,4)-UB tests with compression rate 1/4 bpp.


Table 5.1 Source coding for the modified Hadamard basis elements


In order to study the histogram matching problem in the case of multiple gray
level images now, we need to define a pixel transition matrix, for instance the
one in table 5.2 for the case of 3 gray levels.
ε = noise parameter

          0              1              2
0    1 - ε - ε²          ε              ε²
1        ε            1 - 2ε            ε
2        ε²              ε          1 - ε - ε²

Table 5.2 Ternary pixel transition probability matrix


The prior probability mass function as already discussed is assumed to be
uniform. The Pfa-Pd curve parametrized by ε for the optimal Bayesian rule,
concerning the test of a 2 x 3-pixel template of 3-valued pixels, i.e. the (3,6)-UB
test, is given in plot 24.
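
A ternary transition matrix of this kind can be checked numerically. The sketch below rests on an assumption about its entries: a pixel moves to an adjacent gray level with probability ε and to a level two steps away with probability ε², and the middle diagonal entry is taken as 1 - 2ε so that every row sums to one. The function names are hypothetical.

```python
def ternary_transition_matrix(eps):
    """Assumed 3-level transition matrix; T[i][j] = Pr{observe level j | true level i}."""
    e2 = eps * eps
    return [
        [1 - eps - e2, eps, e2],
        [eps, 1 - 2 * eps, eps],
        [e2, eps, 1 - eps - e2],
    ]


def observed_distribution(prior, T):
    """Distribution of the observed gray level for a given prior on the true level."""
    return [sum(prior[i] * T[i][j] for i in range(3)) for j in range(3)]
```

Under this assumed form the matrix is symmetric with unit row sums, so a uniform prior on the gray levels stays uniform after the noise is applied.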


5.4 The Bayesian Approach with “Position Independent”
Statistics vs. the Bayesian and Maximum Likelihood
Approaches with “Position Dependent” Statistics

The tests we have examined so far rely on the sum-of-pixels or the histogram
statistic. As already indicated the common characteristic of these two statistics is
that they do not take into consideration the position information, as other statistics,
like the correlation statistic, do. We will call position independent the tests or rules
related to the first kind of statistic and position dependent the tests or rules related
to the second kind of statistic. In this subsection we will see how the position
independent statistics compare with the position dependent ones in terms of
computational complexity and the reliability of the related tests.
Consider the case of a d x d-element template (note : d²=n) of M-valued
elements. The set of possible templates is of size |H| = M^n (for instance d=8,
M=2, M^n ≈ 1.8 · 10^19 and d=2, M=16, M^n=65536) which reflects the size of
the state space of a position dependent test. As we have seen in chapters 2 and
4 the template size determines the computational complexity for computing the
Bayesian rule, as well as the complexity and memory requirements for evaluating
the rule. Consequently for problems where from a practical point of view it is not
desirable to develop Bayesian rules, we usually prefer the maximum likelihood
(ML) approach (see subsection 1.2) which gives reasonable results. Still both ML
and Bayesian position dependent rules imply computationally expensive matching
tests, since at each time instant of the image scanning they need O(n^2) operations.

On the other hand the state space size of the position independent tests is
$S(M,n) = \binom{M+n-1}{n}$ (for instance d=8, M=2, S(M,n)=65 and d=2, M=16,
S(M,n)=3876), which is considerably smaller than that of the other Bayesian
rules8. So the histogram related rules, as opposed to the position dependent
Bayesian rules, turn out to be computationally feasible for a wider range
of templates.


8 By using Stirling’s approximation for the factorials one can easily show that for the hard
case of M >> n we have

S  (  M .  II) _  I  <  \r
Figure 5.2 One time instant sliding of the scanning image window


Furthermore, the matching test requires O(min(max(d,M),d²)) operations, as
opposed to the O(d²) operations for both ML and Bayesian position dependent
rules : Consider two consecutive positions X^t, X^{t+1} of the d x d window
scanning the image (see fig. 5.2). Let h_0, h_1, ..., h_{d-1} and h_1, h_2, ..., h_d
represent the histograms of the elements of the column vectors x_0, x_1, ..., x_{d-1}
and x_1, x_2, ..., x_d of X^t and X^{t+1}. The values of the histogram statistics
for the two consecutive positions are

\[ h^{t} = h_0 + h_1 + \cdots + h_{d-1} , \qquad h^{t+1} = h_1 + h_2 + \cdots + h_d ; \]

note that h^{t+1} = h^t + h_d - h_0, which requires O(d) operations. Afterwards we
have to compare the M-vector h^{t+1} with the template histogram. This is an O(M)
operation; but since only the nonzero elements of the M-vector count and these
cannot be more than d², the overall test requires O(min(max(d,M),d²)) operations.
Note though that this is a conservative estimate; we expect that the complexity
will practically be reduced to O(d) because of the smoothness of the image, but
this certainly requires simulation verification.
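
The incremental update of the window histogram can be sketched as follows. This is an illustrative implementation under assumptions that are not part of the thesis: the image strip is a list of d rows of integer element values in 0..M-1, the window slides horizontally one column at a time, and the function names are hypothetical.

```python
def column_histogram(column, M):
    """Histogram (length-M list of counts) of one column of element values."""
    hist = [0] * M
    for value in column:
        hist[value] += 1
    return hist


def sliding_histograms(rows, d, M):
    """Yield the histogram of each d x d window sliding along a strip of d rows."""
    width = len(rows[0])
    # One histogram per column; building each one costs O(d).
    col_hists = [
        column_histogram([rows[r][c] for r in range(d)], M) for c in range(width)
    ]
    # First window: sum of the first d column histograms.
    window = [sum(col_hists[c][v] for c in range(d)) for v in range(M)]
    yield list(window)
    for t in range(1, width - d + 1):
        leaving, entering = col_hists[t - 1], col_hists[t + d - 1]
        # Add the entering column and subtract the leaving one,
        # instead of recounting all d*d elements of the window.
        for v in range(M):
            window[v] += entering[v] - leaving[v]
        yield list(window)
```

In this dense-list form each slide touches all M bins; a sparse representation that stores only the nonzero bars of the two column histograms would be needed to approach the smaller operation counts discussed above.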

The penalty for the reduced complexity of the position independent statistics
is a higher probability of false alarm compared to that of the position dependent
ones. Let us suppose that among the d²=n elements of X^t we have k distinct ones
namely n_0 of type 0, n_1 of type 1, ..., n_k of type k, so that

\[ n_0 + n_1 + \cdots + n_k = n . \]

This histogram corresponds to

\[ c(\mathbf{n}) = \frac{n!}{n_0!\, n_1! \cdots n_k!} \]

possible template patterns. Suppose that among these c = c(n) patterns is the
one we want to detect. Ideally our rule will respond positively each time it scans
one of these patterns. Consequently, each time we scan our target pattern, the
rule will recognize it; but it will recognize all the c - 1 patterns sharing the same
histogram with our target as well. So if we denote by $h$, $P_{fa}$, $P_d$ and
$\tilde{h}$, $\tilde{P}_{fa}$, $\tilde{P}_d$ the
state and performance evaluation parameters for the position independent test and
the position dependent test respectively, we expect by intuition that

\[ \tilde{P}_d = P_d \quad \text{and} \quad \tilde{P}_{fa} = P_{fa} + (c - 1)\, P_d . \]

We shall see though that things are slightly different.

Let us denote with d the decision outcome 0 or 1 from now on. In figure 5.3
we can see the relations of the regions characterized by the values of $h$, $\tilde{h}$ and d.
Figure 5.3 Comparison of the performance characteristics
for the position dependent and position independent tests.


We now make the following assumption : Given that an image element lies
in a template, the probability mass function of the random variable indicating its
position and ranging over all the allowable positions is a uniform pmf. In other
words if c(n) distinct patterns result in the same histogram n (and no other does),
given the histogram n any pattern resulting in it may occur with probability 1/c(n).

From figure 5.3 we deduce that the above assumption implies :

\[ \Pr\{\tilde{h} = 1\} = \frac{1}{c} \Pr\{h = 1\} \qquad (1) \]

\[ \Pr\{d = 1, \tilde{h} = 1\} = \frac{1}{c} \Pr\{d = 1, h = 1\} . \qquad (2) \]

Dividing (2) by (1) gives $\tilde{P}_d = P_d$, as it was expected. We also have :

\[
\begin{aligned}
\Pr\{\tilde{h} = 0\} &= \Pr\{h = 0\} + \Pr\{\tilde{h} = 0, h = 1\} \\
 &= \Pr\{h = 0\} + \Pr\{h = 1\} - \Pr\{\tilde{h} = 1\} \\
 &= \Pr\{h = 0\} + \Pr\{h = 1\} - \frac{1}{c} \Pr\{h = 1\} \\
 &= \Pr\{h = 0\} + \left(1 - \frac{1}{c}\right) \Pr\{h = 1\}
\end{aligned}
\]

and similarly

\[ \Pr\{d = 1, \tilde{h} = 0\} = \Pr\{d = 1, h = 0\} + \left(1 - \frac{1}{c}\right) \Pr\{d = 1, h = 1\} . \]

So

\[
\tilde{P}_{fa} = \Pr\{d = 1 \mid \tilde{h} = 0\}
 = \frac{\Pr\{d = 1, h = 0\} + \left(1 - \frac{1}{c}\right) \Pr\{d = 1, h = 1\}}
        {\Pr\{h = 0\} + \left(1 - \frac{1}{c}\right) \Pr\{h = 1\}}
 = \frac{\alpha P_{fa} + \beta P_d}{\alpha + \beta} , \qquad
 \frac{\beta}{\alpha} = \left(1 - \frac{1}{c}\right) \frac{\Pr\{h = 1\}}{\Pr\{h = 0\}} .
\]

This is a convex combination of Pfa and Pd, which is not exactly what we
were expecting to find. We may observe that as n gets larger c gets larger and
consequently β/α gets larger, so $\tilde{P}_{fa}$ gets larger. Note also that the prior knowledge
Pr{h = i}, i = 0, 1, explicitly affects $\tilde{P}_{fa}$. For the full black template we
have c(n)=1; so for equal priors we have $\tilde{P}_{fa} = P_{fa}$. This result holds for the
all-white template as well. The Pfa probability for optimal Bayesian rules as a
function of the noise parameter ε is given in plot 18.
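
Under the uniform-position assumption, the false alarm rate measured against the exact target pattern is a weighted average of the histogram-level Pfa and Pd, with a weight ratio of (1 - 1/c) Pr{h = 1} / Pr{h = 0}. The following sketch evaluates that weighted average; the function and parameter names are hypothetical, and `prior_h1` stands for Pr{h = 1}.

```python
def pattern_level_false_alarm(p_fa, p_d, c, prior_h1):
    """Weighted average of p_fa and p_d with weight ratio (1 - 1/c) * Pr{h=1} / Pr{h=0}.

    p_fa, p_d : histogram-level false alarm and detection probabilities
    c         : number of patterns sharing the target histogram
    prior_h1  : assumed prior probability of the target histogram, Pr{h = 1}
    """
    ratio = (1 - 1 / c) * prior_h1 / (1 - prior_h1)
    return (p_fa + ratio * p_d) / (1 + ratio)
```

With c = 1 (the full black or all-white template) the ratio is zero and the function returns `p_fa` unchanged; for larger c the value moves from `p_fa` towards `p_d`.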


Thus our position independent tests :

• Are faster but tend to have higher probability of false alarm than both ML
and Bayesian position dependent tests.

• Admit analytic performance evaluation and are more powerful than the ML
position dependent tests, since they have the same power as the Bayesian
position dependent tests.

• Are more attractive from the computational complexity point of view than
the position dependent Bayesian tests, but still not adequately attractive for
non trivial templates.

• Are equivalent to the Bayesian position dependent tests, in terms of the Pfa
and Pd characteristics, for the case of the full black and all-white templates,
under mild conditions.


5.5  Overview 

The template matching problem for noise corrupted binary images was considered.
We attempted a Bayesian formulation of the problem that resulted in
the use of simplifying statistics. In this way the complexity of determining the
optimal Bayesian rule is reduced at the cost of a higher false alarm rate. Matching
rules for the original pixel image, as well as data of a coarser resolution and
data coded by a modification of the Hadamard basis were developed. These rules
gave an intuition of how the data compression and the noise affect the detection
capability. Our conclusion is that a certain amount of noise is “killed” by appropriate
data compression. Another issue studied was the knowledge about the
background surrounding the target object; such knowledge was shown to enhance
the reliability of the matching rule.
Practically in “Bayesian template matching”, given a target template (or set
of target templates) and a certain level of noise, we should :

1. Determine the set of patterns for which we decide “matching”; this is in
general a very time consuming process.

2. Given this set, reduce the search for an object to searching in the coded
image and checking if the pattern lies in our set.

An application of this process could be characterized as “searching in a bank
of images”, for a given object. Since we know that we suffer a high false alarm
rate but the probability of detection is still high, we can follow the first selection
of candidate matching patterns with an ML position dependent test (see subsection
1.2); this second test may be slower but more reliable (lower Pfa).

From what we have seen so far our approach has a major weakness: We
restrict ourselves to small templates (of size n=4, or 8 x 8-pixel) of binary images,
i.e. we can describe just a small class of possible target objects. This happens
because optimal Bayesian rules require computations of mean values over all
the possible patterns, which run up to M^n for a matching rule based on position
dependent statistics and up to $S(M,n) = \binom{M+n-1}{n}$ for the rule based on the
histogram statistic. For instance 16^9 ≈ 6.9 · 10^10, while S(16,9) ≈ 1.3 · 10^6 for
n=9. Consequently the object we are looking for must belong to some restricted
class of such objects. An example of such a class is found by Knudson; in [Knu75]
he shows experimentally that only 62 out of 2^64 8 x 8 binary block patterns suffice
to give a good quality for newspaper text and graphics. A matching test for
8 x 8-pixel templates can be implemented with a codebook of size S(62,1)=62. A
test for a 2 x 2-block template of 8 x 8-pixel blocks with a template codebook of
size S(62,4)=677,040 is feasible as well. (Note that the corresponding position
dependent test has a codebook of size larger than 11 million codewords).

The  “bar  code”  patterns  may  constitute  some  other  class  of  objects  to  be 
identified.  Nevertheless,  the  use  of  a  Bayesian  matching  rule  is  worthwhile  only 
in  the  case  where  we  have  heavy  noise  corruption. 

An attempt to expand the scope of the Bayesian template matching rules to
the class of arbitrary multiple gray level objects can be made by using some
efficient coding scheme like the one discussed in [TaFa89]. Compaction rates up to
0.25 bpp are achieved by using subband coding [WoNe86]; it is shown that the
lowest frequency subband (LFS) contains the most important data needed for the
reconstruction of the image; Huffman code is used to encode 4 x 4-pixel blocks
in each subband. Although a fixed length code would give a codebook of size
2^(4x4x0.25)=16, the size of the codebook used for the variable length code must be
much larger. The compromise we have to make amounts to :

• Use only the data of the lowest frequency subband.

• Use the M most frequent block codewords, where M is sufficiently large to
make the class of the target wide, and at the same time small enough to make
the template codebook S(M,n) small.

• The size of the template in blocks is expected to be n=1, since Information
Theoretic results imply that M (or log2 M) must be large if small compaction
distortion is to be obtained9.

9 See Shannon’s first coding theorem [Bla87, pp74]
An application of the matching test based on the histogram statistic is the
detection of patterns based on color information. This rule, being faster than the
ordinary template matching, reflects the human ability of recognizing the color
faster than the shape of an object [Chr75]10.

A  Bayesian  approach  which  is  not  as  straightforward  as  the  one  already  studied 
could  incorporate  some  probabilistic  model  for  the  image  (e.g.  an  Autoregressive 
model).  Under  this  treatment  the  matching  rule  could  be  formulated  as  a  sequential 
detection  rule,  thus  avoiding  the  large  state  space.  However  this  approach  requires 
the  assumption  of  stationarity  for  the  image  statistics,  a  situation  that  is  not 
realistic. 


10  Experimental  results  concerning  the  time  required  to  locate  color  targets  relative  to  the  time  for 
shape  targets  localization  are  given  in  figures  8-12.  It  is  (experimentally)  shown  that  identification 
based  on  color  information  is  faster  than  the  one  based  on  achromatic  information,  provided  the 
awareness  of  the  color  of  the  targets  (i.e.  in  our  formulation,  provided  we  have  a  known  target 
color  histogram). 
Appendix A Algorithms


A.1 The Hamming weight of the m x m-pixel
patterns of the test in §2.1

In subsection 2.1 we have seen that apart from the all-white pattern (2m-1)²
possible patterns for the test window are considered. For the special case of
m=2 the Hamming weights (i.e. the number of black pixels in the pattern) for the
9 possible patterns (see fig. 2.1) are given in figure A.1a.


Figure A.1 Hamming weights for some pattern samples (figure annotation: symmetric triangles)


The weights for the case of m=5 are given in figure A.1b. One can observe
that all distinct weights appear in the triangle “1-5-25”. This property holds for
the general case of an arbitrary value of m, because of the symmetric overlaps
between the black square on the image and the scanning test window. So we
may have at most

\[ m + (m - 1) + \cdots + 1 = \frac{m (m + 1)}{2} \]

distinct weight values. This leads to a reduction of the number of the states of
the corresponding M-ary test from (2m-1)² + 1 to m(m+1)/2 + 1 of them, i.e.
a reduction by a factor close to 8. The set S of all possible weights we may
have is :

\[ S = \bigcup_{i=1}^{m} \bigcup_{j=i}^{m} \{ ij \} , \]

where duplications in S can be exploited for fixed values of m.

If we are interested in the number of occurrences of each weight value in the
set of the (2m-1)² patterns, we may observe that we have one time the weight m²,
4 times the weights belonging to the set $\bigcup_{j=1}^{m-1} \{ mj, j^2 \}$, and 8 times the weights
belonging to the set $\bigcup_{i=1}^{m-2} \bigcup_{j=i+1}^{m-1} \{ ij \}$.
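
These counts can be checked by brute force. The sketch below assumes an m x m test window sliding over an m x m black square on a white background, so that a window offset (dx, dy) overlaps the square in a rectangle of (m - |dx|) by (m - |dy|) black pixels; the function name is hypothetical.

```python
from collections import Counter


def overlap_weights(m):
    """Count each Hamming weight over the (2m-1)^2 offsets that overlap the square."""
    counts = Counter()
    for dx in range(-(m - 1), m):
        for dy in range(-(m - 1), m):
            # Black pixels seen by the window = area of the overlap rectangle.
            counts[(m - abs(dx)) * (m - abs(dy))] += 1
    return counts
```

For m=2 this gives weight 4 once and weights 2 and 1 four times each, which accounts for the 9 patterns. For m=5 the weight 4 occurs 12 times, since it arises both as 1·4 (8 times) and as 2·2 (4 times); this is an instance of the duplications in S mentioned above.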


A.2  The  integer  partitions  of  a  nonnegative  integer  n 

The problem is posed as follows: Given a natural integer n (i.e.
n ∈ N = {1, 2, ...}) find all the finite sequences n_1, n_2, ... of the form

\[ \sum_{i} i\, n_i = n , \qquad n_i \in \{0, 1, 2, \ldots\} . \]
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The  key  idea  is  as  follows11  :  We  consider  the  product: 

P  (x)  =  (1  +  oj.r  +  Oj.i +  •  •  •)  ^1  +  o  >.r“  +  o^.r4  +  ■  •  •  +  +  •  •  •  j  •  •  • 

+  ak-xk  +  o  •  •  • 

=  i  +  «,.r  +  (o;v=.-of +  •••)/ 

Observe  that  the  term 

which  appears  in  the  coefficient  of  x"  is  such  that  ??]  4-  2n  >  -(-•-•  +  &/<*.  =  »  and 
thus  it  determines  the  following  partition  of  rt  : 


n  -  k+k+...+k  + 

—  nk~- 


+  2+2+.. .+2  +  1 +1 + ...+1 


We are interested in the sum-of-products expansion of the n-th coefficient in the product $P(x) = a_1(x)\, a_2(x) \cdots a_k(x) \cdots$ given above. Obviously this coefficient is not affected by:

1. any multiplicand $a_i(x)$, $i > n$

2. any summands in $a_i(x)$, $i \le n$, with order greater than n.

Therefore we can eliminate all these terms without affecting the resulting partitions.
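
The truncated product can be expanded mechanically. The following is a minimal sketch, not the thesis implementation: each monomial $a_1^{n_1} \cdots a_k^{n_k}$ is stored as the exponent tuple $(n_1, \ldots, n_k)$, and the function names `partitions_by_product` and `as_parts` are hypothetical.

```python
def partitions_by_product(n):
    """Expand the truncated product and return the monomials of x^n.

    Each monomial a_1^{n_1} ... a_n^{n_n} is returned as the exponent
    tuple (n_1, ..., n_n), which satisfies n_1 + 2*n_2 + ... + n*n_n = n.
    """
    # coeffs[d] lists the monomials currently multiplying x^d
    coeffs = {0: [()]}
    for k in range(1, n + 1):            # factors a_k(x) with k > n are dropped
        new = {}
        for d, monomials in coeffs.items():
            j = 0
            while d + j * k <= n:        # summands of order > n are dropped
                new.setdefault(d + j * k, []).extend(
                    mono + (j,) for mono in monomials)
                j += 1
        coeffs = new
    return coeffs.get(n, [])

def as_parts(exponents):
    """Turn (n_1, ..., n_k) into the partition k+...+k + ... + 1+...+1."""
    return [k for k in range(len(exponents), 0, -1)
            for _ in range(exponents[k - 1])]

for mono in partitions_by_product(4):
    print(mono, as_parts(mono))
```

Keeping only degrees up to n in every intermediate product is exactly the elimination of terms described above; the number of returned monomials is the number of partitions of n.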


¹¹ The solution we propose is an extension of the solution proposed in [ToMe85, p. 174] for finding the number of the integer partitions of a nonnegative integer n.
A.3  Solution for a system of inequalities in the domain of nonnegative integers

We have the following system of inequalities:

$$a^i_{\min} \le k^i_1 + k^i_2 + \cdots + k^i_n \le a^i_{\max}, \quad i = 1, \ldots, m$$

$$b^j_{\min} \le k^1_j + k^2_j + \cdots + k^m_j \le b^j_{\max}, \quad j = 1, \ldots, n$$

where all variables belong to the set {0, 1, 2, ...}.

The algorithm we use performs an exhaustive search for the solutions of the system:

1.  for each i in 1, ..., m

    for each $a^i$ in $a^i_{\min}, \ldots, a^i_{\max}$

    determine the set of integer partitions $\{k^i\}$, $k^i = [k^i_1, k^i_2, \ldots]$, of $a^i$ in no more than n places, and all permutations of these partitions in n places.
    comment: each m-tuple for which the i-th element belongs to $\{k^i\}$ satisfies the first set of inequalities, and no other m-tuples do.

2.  for each m-tuple constructed as described in the comment above

    check if the second set of inequalities is satisfied;
    if yes, obtain a solution $[k^1, k^2, \ldots, k^m]$
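
A minimal sketch of this exhaustive search follows. It is an illustration under stated assumptions, not the original program: the names `compositions`, `solve`, `a_min`, `a_max`, `b_min` and `b_max` are hypothetical, the bounds are taken as inclusive, and the n-tuples with a given sum are generated directly, which yields the same set as taking the partitions in at most n places together with all their permutations (padded with zeros).

```python
from itertools import product

def compositions(total, places):
    """Yield all tuples of `places` nonnegative integers summing to `total`."""
    if places == 0:
        if total == 0:
            yield ()
        return
    for first in range(total + 1):
        for rest in compositions(total - first, places - 1):
            yield (first,) + rest

def solve(a_min, a_max, b_min, b_max):
    """Return all m-tuples (k^1, ..., k^m) of n-tuples satisfying both sets."""
    n = len(b_min)
    # Step 1: candidates for each row i satisfy the first set of inequalities.
    rows = [[c for a in range(lo, hi + 1) for c in compositions(a, n)]
            for lo, hi in zip(a_min, a_max)]
    # Step 2: keep the m-tuples whose column sums satisfy the second set.
    solutions = []
    for candidate in product(*rows):
        column_sums = [sum(column) for column in zip(*candidate)]
        if all(lo <= s <= hi for lo, s, hi in zip(b_min, column_sums, b_max)):
            solutions.append(candidate)
    return solutions

# m = n = 2, every row sum and every column sum forced to 1
print(solve([1, 1], [1, 1], [1, 1], [1, 1]))
```

The search visits every combination of row candidates, so its cost grows quickly with m, n and the width of the bounds.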


A.4  Determining the decision region $R_1$ of subsection 4.2.4


We want to determine the set of histograms

$$R_1 = \{\, y : f(y) = S_1(y) - S_0(y) \ge 0 \,\}$$

which is the region for deciding that the test window matches the given template.

Consider the case in which the set of target templates $H_1$ is a singleton, $H_1 = \{y_0\}$. The key idea is that the condition $f(y) \ge 0$ can be satisfied only by the points close to $y_0$; in other words, $R_1$ will be a neighborhood of $y_0$. Stepping away from $y_0$ (the given template) will cause $f(y)$ to get gradually smaller and at some point to become negative. Let us define the set of closest neighbors of a vector y to be the set of vectors differing from y at only one coordinate. Note that if y has n coordinates and each coordinate may take one of M possible values, then y has $n(M-1)$ closest neighbors.

The algorithm, which we call the “stepping process”, functions as follows:

1.  Check if $f(y_0) \ge 0$. If not, we have the trivial case of the identity decision rule $d(y) = 0$ for every y; exit the algorithm. Otherwise put $y_0$ in $R_1$ and tag it as “unchecked”.

2.  Check which of the closest neighbors y of $y_0$ satisfy the condition $f(y) \ge 0$ and put them in $R_1$ with their tags marked as “unchecked”.

3.  Mark $y_0$’s tag as “checked”.

4.  For the “unchecked” vectors in $R_1$ repeat steps 2 and 3 until all vectors in $R_1$ are “checked”.

Let us refine the second step of this process: Changing one of the coordinates of $y_0$, say the i-th one from the value k to the value l, has the following effects:

1.  The vector $y_0$ is transformed into some other vector, say $y_i$, differing from $y_0$ only at the i-th position.

2.  The $S_1(\cdot)$ term in $f(\cdot)$ gets smaller (usually by several orders of magnitude), since $S_1(y_0)$ will be substituted by

$$S_1(y_i) = \frac{P_k(l)}{P_k(k)}\, S_1(y_0) < S_1(y_0),$$

where $P_k(l)$ and $P_k(k)$ are the block transition probabilities, for which we have $P_k(l) < P_k(k)$ (unless we have a pathological case of noise).

3.  The $S_0(\cdot)$ term of $f(\cdot)$ gets larger, since $y_i \in H_0$ by construction and the summand in $S_0(\cdot)$ corresponding to it will get significantly larger while the other summands will approximately retain their values.

This monotonicity property of $f(\cdot)$ implies that $R_1$ has to be a neighborhood of $y_0$. The algorithm presented has a recursive nature, since each vector entering the $R_1$ decision region becomes a starting point whose “closest” neighbors are to be checked.
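
The stepping process can be sketched as a queue-driven search over closest neighbors. This is a minimal illustration, not the original program: `f` is a hypothetical callable standing in for $f(y) = S_1(y) - S_0(y)$, coordinates are assumed to take the values 0, ..., M-1, and an explicit queue of “unchecked” vectors replaces the recursion.

```python
from collections import deque

def closest_neighbors(y, M):
    """Yield the n(M-1) vectors differing from y at exactly one coordinate."""
    for i, value in enumerate(y):
        for other in range(M):
            if other != value:
                yield y[:i] + (other,) + y[i + 1:]

def stepping_process(y0, f, M):
    """Grow the region {y : f(y) >= 0} outward from the template y0."""
    y0 = tuple(y0)
    if f(y0) < 0:                 # step 1: trivial decision rule, empty region
        return set()
    region = {y0}
    unchecked = deque([y0])       # vectors whose neighbors are still to be tested
    while unchecked:              # step 4: repeat until everything is checked
        y = unchecked.popleft()   # step 3: y becomes "checked"
        for z in closest_neighbors(y, M):      # step 2
            if z not in region and f(z) >= 0:
                region.add(z)
                unchecked.append(z)
    return region

# Toy stand-in for f: nonnegative only within Hamming distance 1 of y0.
y0 = (0, 0)
toy_f = lambda y: 1 - sum(a != b for a, b in zip(y, y0))
print(sorted(stepping_process(y0, toy_f, M=4)))
```

With the toy function the region consists of $y_0$ and its 2(4-1) = 6 closest neighbors. The queue holds at most the vectors already admitted to the region, so memory grows with the size of the region.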


Figure A.2  The flow graph of the “stepping process”
The flow of the stepping process is schematically given in figure A.2. If $H_1$ is not a singleton, we must repeat the proposed algorithm once for each element of $H_1$, resulting in the decision regions $R_1^i$, $i = 0, \ldots, k$. The decision region $R_1$ will be their union, $R_1 = R_1^0 \cup R_1^1 \cup \cdots \cup R_1^k$. A one-step instance for the special case of M=4, n=2 is given in figure A.3.


[Figure A.3 labels: vectors in 2 places (2 x 0; 1 x 0, 1 x 1; 2 x 1; etc.); 2(4-1) = 6 neighbors]

Figure A.3  An example for the “stepping process”


Let us define $N^1$ to be the set of closest neighbors of $y_0$, $N^2$ the union of the closest neighbor sets of all the elements in $N^1$, and similarly define the sets $N^3$, $N^4$, etc. The ROC plots found in the present work are based on the performance evaluation of a sequence of tests with decision regions $N^1$, $N^2$, $N^3$, etc. Note that the recursive nature of the algorithm presented implies vast memory requirements when the decision region $R_1$ or $N^k$ is relatively large (we found experimentally that it should not contain more than 100 elements); this limits the set of tests for which the ROC curves can be obtained.
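
The growth of these sets is easy to see in a small sketch. It follows the definition above literally and assumes coordinate values 0, ..., M-1; the function names are hypothetical, and whether $y_0$ itself is meant to belong to each decision region is not specified here.

```python
def closest_neighbors(y, M):
    """Set of vectors differing from y at exactly one coordinate."""
    return {y[:i] + (w,) + y[i + 1:]
            for i, v in enumerate(y) for w in range(M) if w != v}

def neighbor_sets(y0, M, depth):
    """Return [N^1, ..., N^depth], each built from the previous set."""
    sets = [closest_neighbors(tuple(y0), M)]
    for _ in range(depth - 1):
        previous = sets[-1]
        sets.append(set().union(*(closest_neighbors(y, M) for y in previous)))
    return sets

N = neighbor_sets((0, 0), M=4, depth=2)
print(len(N[0]))  # 6 closest neighbors, as in figure A.3
print(len(N[1]))
```

Even for M=4, n=2 the second set already covers all $4^2 = 16$ vectors, which illustrates why the sizes, and hence the memory requirements, grow so quickly with n and M.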
Appendix B  Performance evaluation plots

The results given concern 64 x 64-pixel binary images. For the Gaussian noise case we used a 5 x 5-pixel full black template, while for the BSC noise case we used an 8 x 8-pixel (or equivalent coded) full black template unless otherwise stated.

The compression rates in bits per pixel (bpp) for the tests we studied are given in table B.1.


test        compression rate (bpp)
(2,64)      1
(2,16)      1/4
(2,4)       1/16
(4,2)       1/4
(16,4)      1/4
(3,6)       1

Table B.1  Compression rates for the tests studied


A list of the plots is given below. ROC stands for Receiver Operating Characteristic curve, KB stands for “known background”, and UB stands for “unknown background”. Most of the results presented are based on the analysis described in chapters 2 and 4. The simulation results are produced by one Monte Carlo run for which the optimal thresholds found by the theoretical analysis are assumed to be known.

□  uncoded  image  data 

Plot 1: Gaussian noise; Pd(σ²)-Pfa(σ²) curve; KB;

Plot 2: Gaussian noise; ROC, σ²=0.4; KB;

Plot 3: BSC noise; (2,64)-KB test; Pd(ε)-Pfa(ε) curve;

Plot 4: BSC noise; (2,64)-KB test; ROC, ε=0.1;

□  the  corrupt-code-detect  system 

Plot 5: (2,16)-KB test; Pd(ε)-Pfa(ε) curve;

Plot 6: (2,16)-KB test; ROC, ε=0.1;

Plot 7: (2,4)-KB test; Pd(ε)-Pfa(ε) curve; equivalently: (16,4)-KB test;

Plot 8: (2,4)-KB test; ROC, ε=0.1; equivalently: (16,4)-KB test;

Plot 9: (4,2)-UB test; synthetic template (see fig. 5.1); ROC, ε=0.05, ε=0.1, ε=0.2, ε=0.3, ε=0.4;

Plot 10: (16,4)-UB test; Pd(ε)-Pfa(ε) curve;

□  comparative  plots 

Plot 11: (2,64), (2,16)-KB tests; ROC, ε=0.1;

Plot 12: (2,64), (2,16)-KB tests; ROC, ε=0.3;

Plot 13: (2,64), (2,16), (2,4)-KB (equiv. (16,4)-KB), (16,4)-UB tests; ε-Pfa curve for optimal Bayesian tests;

Plot 14: (2,64), (2,16), (2,4)-KB (equiv. (16,4)-KB), (16,4)-UB tests; ε-Pd curve for optimal Bayesian tests;
□  the code-corrupt-detect system

Plot 15: (2,16)-KB test; Pd(ε)-Pfa(ε) curve;

Plot 16: (2,16)-KB test; ROC, ε=0.1;

Plot 17: (2,4)-KB test; Pd(ε)-Pfa(ε) curve;

Plot 18: (2,4)-KB test; ROC, ε=0.1;

Plot 19: (16,4)-KB test; Pd(ε)-Pfa(ε) curve;

Plot 20: (16,4)-KB test; ROC, ε=0.1;

Plot 21: (16,4)-UB test; Pd(ε)-Pfa(ε) curve;

□  comparative  plots 

Plot 22: (2,64), (2,16), (2,4)-KB (equiv. (16,4)-KB), (16,4)-UB tests; ε-Pfa curve for optimal Bayesian tests;

Plot 23: (2,64), (2,16), (2,4)-KB (equiv. (16,4)-KB), (16,4)-UB tests; ε-Pd curve for optimal Bayesian tests;

□  multiple gray level pixels

Plot 24: (3,6)-UB test; Pd(ε)-Pfa(ε) curve.
[Plot 13: epsilon-Pfa graph. Curves: (2,64)-KB test; (2,16)-KB test; (2,4)-KB, (16,4)-KB tests. Axis: Pfa.]
[Plot 16: The code-corrupt-detect system: ROC for the case of coarsening the resolution; (2,16)-KB test; epsilon=0.1. Curves: theoretical result, simulated result.]
[Plot axis label: pixel inversion probability]